Cell and Molecular Biology. Concepts and Experiments [7 ed.]


English Pages [874] Year 2013

Table of contents :
Cover
Title Page
Copyright page
About the Author
Preface
Acknowledgments
Contents
Chapter 1: Introduction to the Study of Cell and Molecular Biology
1.1 The Discovery of Cells
1.2 Basic Properties of Cells
Cells Are Highly Complex and Organized
Cells Possess a Genetic Program and the Means to Use It
Cells Are Capable of Producing More of Themselves
Cells Acquire and Utilize Energy
Cells Carry Out a Variety of Chemical Reactions
Cells Engage in Mechanical Activities
Cells Are Able to Respond to Stimuli
Cells Are Capable of Self-Regulation
Cells Evolve
1.3 Two Fundamentally Different Classes of Cells
Characteristics That Distinguish Prokaryotic and Eukaryotic Cells
Types of Prokaryotic Cells
Types of Eukaryotic Cells: Cell Specialization
The Sizes of Cells and Their Components
Synthetic Biology
THE HUMAN PERSPECTIVE: The Prospect of Cell Replacement Therapy
1.4 Viruses
Viroids
EXPERIMENTAL PATHWAYS: The Origin of Eukaryotic Cells
Chapter 2: The Chemical Basis of Life
2.1 Covalent Bonds
Polar and Nonpolar Molecules
Ionization
2.2 Noncovalent Bonds
THE HUMAN PERSPECTIVE: Free Radicals as a Cause of Aging
Ionic Bonds: Attractions between Charged Atoms
Hydrogen Bonds
Hydrophobic Interactions and van der Waals Forces
The Life-Supporting Properties of Water
2.3 Acids, Bases, and Buffers
2.4 The Nature of Biological Molecules
Functional Groups
A Classification of Biological Molecules by Function
2.5 Four Types of Biological Molecules
Carbohydrates
Lipids
Proteins
THE HUMAN PERSPECTIVE: Protein Misfolding Can Have Deadly Consequences
Nucleic Acids
2.6 The Formation of Complex Macromolecular Structures
The Assembly of Tobacco Mosaic Virus Particles and Ribosomal Subunits
EXPERIMENTAL PATHWAYS: Chaperones: Helping Proteins Reach Their Proper Folded State
Chapter 3: Bioenergetics, Enzymes, and Metabolism
3.1 Bioenergetics
The Laws of Thermodynamics and the Concept of Entropy
Free Energy
3.2 Enzymes as Biological Catalysts
The Properties of Enzymes
Overcoming the Activation Energy Barrier
The Active Site
Mechanisms of Enzyme Catalysis
Enzyme Kinetics
THE HUMAN PERSPECTIVE: The Growing Problem of Antibiotic Resistance
3.3 Metabolism
An Overview of Metabolism
Oxidation and Reduction: A Matter of Electrons
The Capture and Utilization of Energy
Metabolic Regulation
Chapter 4: The Structure and Function of the Plasma Membrane
4.1 An Overview of Membrane Functions
4.2 A Brief History of Studies on Plasma Membrane Structure
4.3 The Chemical Composition of Membranes
Membrane Lipids
The Asymmetry of Membrane Lipids
Membrane Carbohydrates
4.4 The Structure and Functions of Membrane Proteins
Integral Membrane Proteins
Studying the Structure and Properties of Integral Membrane Proteins
Peripheral Membrane Proteins
Lipid-Anchored Membrane Proteins
4.5 Membrane Lipids and Membrane Fluidity
The Importance of Membrane Fluidity
Maintaining Membrane Fluidity
Lipid Rafts
4.6 The Dynamic Nature of the Plasma Membrane
The Diffusion of Membrane Proteins after Cell Fusion
Restrictions on Protein and Lipid Mobility
The Red Blood Cell: An Example of Plasma Membrane Structure
4.7 The Movement of Substances Across Cell Membranes
The Energetics of Solute Movement
Diffusion of Substances through Membranes
Facilitated Diffusion
Active Transport
THE HUMAN PERSPECTIVE: Defects in Ion Channels and Transporters as a Cause of Inherited Disease
4.8 Membrane Potentials and Nerve Impulses
The Resting Potential
The Action Potential
Propagation of Action Potentials as an Impulse
Neurotransmission: Jumping the Synaptic Cleft
EXPERIMENTAL PATHWAYS: The Acetylcholine Receptor
Chapter 5: Aerobic Respiration and the Mitochondrion
5.1 Mitochondrial Structure and Function
Mitochondrial Membranes
The Mitochondrial Matrix
5.2 Oxidative Metabolism in the Mitochondrion
The Tricarboxylic Acid (TCA) Cycle
The Importance of Reduced Coenzymes in the Formation of ATP
THE HUMAN PERSPECTIVE: The Role of Anaerobic and Aerobic Metabolism in Exercise
5.3 The Role of Mitochondria in the Formation of ATP
Oxidation-Reduction Potentials
Electron Transport
Types of Electron Carriers
5.4 Translocation of Protons and the Establishment of a Proton-Motive Force
5.5 The Machinery for ATP Formation
The Structure of ATP Synthase
The Basis of ATP Formation According to the Binding Change Mechanism
Other Roles for the Proton-Motive Force in Addition to ATP Synthesis
5.6 Peroxisomes
THE HUMAN PERSPECTIVE: Diseases that Result from Abnormal Mitochondrial or Peroxisomal Function
Chapter 6: Photosynthesis and the Chloroplast
6.1 Chloroplast Structure and Function
6.2 An Overview of Photosynthetic Metabolism
6.3 The Absorption of Light
Photosynthetic Pigments
6.4 Photosynthetic Units and Reaction Centers
Oxygen Formation: Coordinating the Action of Two Different Photosynthetic Systems
Killing Weeds by Inhibiting Electron Transport
6.5 Photophosphorylation
Noncyclic Versus Cyclic Photophosphorylation
6.6 Carbon Dioxide Fixation and the Synthesis of Carbohydrate
Carbohydrate Synthesis in C3 Plants
Carbohydrate Synthesis in C4 Plants
Carbohydrate Synthesis in CAM Plants
Chapter 7: Interactions Between Cells and Their Environment
7.1 The Extracellular Space
The Extracellular Matrix
7.2 Interactions of Cells with Extracellular Materials
Integrins
Focal Adhesions and Hemidesmosomes: Anchoring Cells to Their Substratum
7.3 Interactions of Cells with Other Cells
Selectins
The Immunoglobulin Superfamily
Cadherins
THE HUMAN PERSPECTIVE: The Role of Cell Adhesion in Inflammation and Metastasis
Adherens Junctions and Desmosomes: Anchoring Cells to Other Cells
The Role of Cell-Adhesion Receptors in Transmembrane Signaling
7.4 Tight Junctions: Sealing The Extracellular Space
7.5 Gap Junctions and Plasmodesmata: Mediating Intercellular Communication
Plasmodesmata
7.6 Cell Walls
Chapter 8: Cytoplasmic Membrane Systems: Structure, Function, and Membrane Trafficking
8.1 An Overview of the Endomembrane System
8.2 A Few Approaches to the Study of Endomembranes
Insights Gained from Autoradiography
Insights Gained from the Use of the Green Fluorescent Protein
Insights Gained from the Biochemical Analysis of Subcellular Fractions
Insights Gained from the Use of Cell-Free Systems
Insights Gained from the Study of Mutant Phenotypes
8.3 The Endoplasmic Reticulum
The Smooth Endoplasmic Reticulum
Functions of the Rough Endoplasmic Reticulum
From the ER to the Golgi Complex: The First Step in Vesicular Transport
8.4 The Golgi Complex
Glycosylation in the Golgi Complex
The Movement of Materials through the Golgi Complex
8.5 Types of Vesicle Transport and Their Functions
COPII-Coated Vesicles: Transporting Cargo from the ER to the Golgi Complex
COPI-Coated Vesicles: Transporting Escaped Proteins Back to the ER
Beyond the Golgi Complex: Sorting Proteins at the TGN
Targeting Vesicles to a Particular Compartment
8.6 Lysosomes
Autophagy
THE HUMAN PERSPECTIVE: Disorders Resulting from Defects in Lysosomal Function
8.7 Plant Cell Vacuoles
8.8 The Endocytic Pathway: Moving Membrane and Materials into the Cell Interior
Endocytosis
Phagocytosis
8.9 Posttranslational Uptake of Proteins by Peroxisomes, Mitochondria, and Chloroplasts
Uptake of Proteins into Peroxisomes
Uptake of Proteins into Mitochondria
Uptake of Proteins into Chloroplasts
EXPERIMENTAL PATHWAYS: Receptor-Mediated Endocytosis
Chapter 9: The Cytoskeleton and Cell Motility
9.1 Overview of the Major Functions of the Cytoskeleton
9.2 The Study of the Cytoskeleton
The Use of Live-Cell Fluorescence Imaging
The Use of In Vitro and In Vivo Single-Molecule Assays
The Use of Fluorescence Imaging Techniques to Monitor the Dynamics of the Cytoskeleton
9.3 Microtubules
Structure and Composition
Microtubule-Associated Proteins
Microtubules as Structural Supports and Organizers
Microtubules as Agents of Intracellular Motility
Motor Proteins that Traverse the Microtubular Cytoskeleton
Microtubule-Organizing Centers (MTOCs)
The Dynamic Properties of Microtubules
Cilia and Flagella: Structure and Function
THE HUMAN PERSPECTIVE: The Role of Cilia in Development and Disease
9.4 Intermediate Filaments
Intermediate Filament Assembly and Disassembly
Types and Functions of Intermediate Filaments
9.5 Microfilaments
Microfilament Assembly and Disassembly
Myosin: The Molecular Motor of Actin Filaments
9.6 Muscle Contractility
The Sliding Filament Model of Muscle Contraction
9.7 Nonmuscle Motility
Actin-Binding Proteins
Examples of Nonmuscle Motility and Contractility
Chapter 10: The Nature of the Gene and the Genome
10.1 The Concept of a Gene as a Unit of Inheritance
10.2 Chromosomes: The Physical Carriers of the Genes
The Discovery of Chromosomes
Chromosomes as the Carriers of Genetic Information
Genetic Analysis in Drosophila
Crossing Over and Recombination
Mutagenesis and Giant Chromosomes
10.3 The Chemical Nature of the Gene
The Structure of DNA
The Watson-Crick Proposal
DNA Supercoiling
10.4 The Structure of the Genome
The Complexity of the Genome
THE HUMAN PERSPECTIVE: Diseases that Result from Expansion of Trinucleotide Repeats
10.5 The Stability of the Genome
Whole-Genome Duplication (Polyploidization)
Duplication and Modification of DNA Sequences
"Jumping Genes" and the Dynamic Nature of the Genome
10.6 Sequencing Genomes: The Footprints of Biological Evolution
Comparative Genomics: "If It's Conserved, It Must Be Important"
The Genetic Basis of "Being Human"
Genetic Variation Within the Human Species Population
THE HUMAN PERSPECTIVE: Application of Genomic Analyses to Medicine
EXPERIMENTAL PATHWAYS: The Chemical Nature of the Gene
Chapter 11: Gene Expression: From Transcription to Translation
11.1 The Relationship between Genes, Proteins, and RNAs
An Overview of the Flow of Information through the Cell
11.2 An Overview of Transcription in Both Prokaryotic and Eukaryotic Cells
Transcription in Bacteria
Transcription and RNA Processing in Eukaryotic Cells
11.3 Synthesis and Processing of Eukaryotic Ribosomal and Transfer RNAs
Synthesizing the rRNA Precursor
Processing the rRNA Precursor
Synthesis and Processing of the 5S rRNA
Transfer RNAs
11.4 Synthesis and Processing of Eukaryotic Messenger RNAs
The Machinery for mRNA Transcription
Split Genes: An Unexpected Finding
The Processing of Eukaryotic Messenger RNAs
Evolutionary Implications of Split Genes and RNA Splicing
Creating New Ribozymes in the Laboratory
11.5 Small Regulatory RNAs and RNA Silencing Pathways
THE HUMAN PERSPECTIVE: Clinical Applications of RNA Interference
MicroRNAs: Small RNAs that Regulate Gene Expression
piRNAs: A Class of Small RNAs that Function in Germ Cells
Other Noncoding RNAs
11.6 Encoding Genetic Information
The Properties of the Genetic Code
11.7 Decoding the Codons: The Role of Transfer RNAs
The Structure of tRNAs
11.8 Translating Genetic Information
Initiation
Elongation
Termination
mRNA Surveillance and Quality Control
Polyribosomes
EXPERIMENTAL PATHWAYS: The Role of RNA as a Catalyst
Chapter 12: Control of Gene Expression
12.1 Control of Gene Expression in Bacteria
Organization of Bacterial Genomes
The Bacterial Operon
Riboswitches
12.2 Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus
The Nuclear Envelope
Chromosomes and Chromatin
THE HUMAN PERSPECTIVE: Chromosomal Aberrations and Human Disorders
Epigenetics: There's More to Inheritance than DNA
The Nucleus as an Organized Organelle
12.3 An Overview of Gene Regulation in Eukaryotes
12.4 Transcriptional Control
The Role of Transcription Factors in Regulating Gene Expression
The Structure of Transcription Factors
DNA Sites Involved in Regulating Transcription
Transcriptional Activation: The Role of Enhancers, Promoters, and Coactivators
Transcriptional Repression
12.5 RNA Processing Control
12.6 Translational Control
Initiation of Translation
Cytoplasmic Localization of mRNAs
The Control of mRNA Stability
The Role of MicroRNAs in Translational Control
12.7 Posttranslational Control: Determining Protein Stability
Chapter 13: DNA Replication and Repair
13.1 DNA Replication
Semiconservative Replication
Replication in Bacterial Cells
The Structure and Functions of DNA Polymerases
Replication in Eukaryotic Cells
13.2 DNA Repair
Nucleotide Excision Repair
Base Excision Repair
Mismatch Repair
Double-Strand Breakage Repair
13.3 Between Replication and Repair
THE HUMAN PERSPECTIVE: The Consequences of DNA Repair Deficiencies
Chapter 14: Cellular Reproduction
14.1 The Cell Cycle
Cell Cycles in Vivo
Control of the Cell Cycle
14.2 M Phase: Mitosis and Cytokinesis
Prophase
Prometaphase
Metaphase
Anaphase
Telophase
Motor Proteins Required for Mitotic Movements
Cytokinesis
14.3 Meiosis
The Stages of Meiosis
THE HUMAN PERSPECTIVE: Meiotic Nondisjunction and Its Consequences
Genetic Recombination During Meiosis
EXPERIMENTAL PATHWAYS: The Discovery and Characterization of MPF
Chapter 15: Cell Signaling and Signal Transduction: Communication Between Cells
15.1 The Basic Elements of Cell Signaling Systems
15.2 A Survey of Extracellular Messengers and Their Receptors
15.3 G Protein-Coupled Receptors and Their Second Messengers
Signal Transduction by G Protein-Coupled Receptors
THE HUMAN PERSPECTIVE: Disorders Associated with G Protein-Coupled Receptors
Second Messengers
The Specificity of G Protein-Coupled Responses
Regulation of Blood Glucose Levels
The Role of GPCRs in Sensory Perception
15.4 Protein-Tyrosine Phosphorylation as a Mechanism for Signal Transduction
The Ras-MAP Kinase Pathway
Signaling by the Insulin Receptor
THE HUMAN PERSPECTIVE: Signaling Pathways and Human Longevity
Signaling Pathways in Plants
15.5 The Role of Calcium as an Intracellular Messenger
Regulating Calcium Concentrations in Plant Cells
15.6 Convergence, Divergence, and Cross-Talk Among Different Signaling Pathways
Examples of Convergence, Divergence, and Cross-Talk Among Signaling Pathways
15.7 The Role of NO as an Intercellular Messenger
15.8 Apoptosis (Programmed Cell Death)
The Extrinsic Pathway of Apoptosis
The Intrinsic Pathway of Apoptosis
Chapter 16: Cancer
16.1 Basic Properties of a Cancer Cell
16.2 The Causes of Cancer
16.3 The Genetics of Cancer
Tumor-Suppressor Genes and Oncogenes: Brakes and Accelerators
The Cancer Genome
Gene-Expression Analysis
16.4 New Strategies for Combating Cancer
Immunotherapy
Inhibiting the Activity of Cancer-Promoting Proteins
Inhibiting the Formation of New Blood Vessels (Angiogenesis)
EXPERIMENTAL PATHWAYS: The Discovery of Oncogenes
Chapter 17: The Immune Response
17.1 An Overview of the Immune Response
Innate Immune Responses
Adaptive Immune Responses
17.2 The Clonal Selection Theory as It Applies to B Cells
Vaccination
17.3 T Lymphocytes: Activation and Mechanism of Action
17.4 Selected Topics on the Cellular and Molecular Basis of Immunity
The Modular Structure of Antibodies
DNA Rearrangements that Produce Genes Encoding B- and T-Cell Antigen Receptors
Membrane-Bound Antigen Receptor Complexes
The Major Histocompatibility Complex
Distinguishing Self from Nonself
Lymphocytes Are Activated by Cell-Surface Signals
Signal Transduction Pathways in Lymphocyte Activation
THE HUMAN PERSPECTIVE: Autoimmune Diseases
EXPERIMENTAL PATHWAYS: The Role of the Major Histocompatibility Complex in Antigen Presentation
Chapter 18: Techniques in Cell and Molecular Biology
18.1 The Light Microscope
Resolution
Visibility
Preparation of Specimens for Bright-Field Light Microscopy
Phase-Contrast Microscopy
Fluorescence Microscopy (and Related Fluorescence-Based Techniques)
Video Microscopy and Image Processing
Laser Scanning Confocal Microscopy
Super-Resolution Fluorescence Microscopy
18.2 Transmission Electron Microscopy
Specimen Preparation for Electron Microscopy
18.3 Scanning Electron and Atomic Force Microscopy
Atomic Force Microscopy
18.4 The Use of Radioisotopes
18.5 Cell Culture
18.6 The Fractionation of a Cell's Contents by Differential Centrifugation
18.7 Isolation, Purification, and Fractionation of Proteins
Selective Precipitation
Liquid Column Chromatography
Polyacrylamide Gel Electrophoresis
Protein Measurement and Analysis
18.8 Determining the Structure of Proteins and Multisubunit Complexes
18.9 Fractionation of Nucleic Acids
Separation of DNAs by Gel Electrophoresis
Separation of Nucleic Acids by Ultracentrifugation
18.10 Nucleic Acid Hybridization
18.11 Chemical Synthesis of DNA
18.12 Recombinant DNA Technology
Restriction Endonucleases
Formation of Recombinant DNAs
DNA Cloning
18.13 Enzymatic Amplification of DNA by PCR
Applications of PCR
18.14 DNA Sequencing
18.15 DNA Libraries
Genomic Libraries
cDNA Libraries
18.16 DNA Transfer into Eukaryotic Cells and Mammalian Embryos
18.17 Determining Eukaryotic Gene Function by Gene Elimination or Silencing
In Vitro Mutagenesis
Knockout Mice
RNA Interference
18.18 The Use of Antibodies
Glossary
Additional Readings
Index


Cell and Molecular Biology. Concepts and Experiments [7 ed.]

Author / Uploaded
Karp G.

Commentary
eBook


Cell and Molecular Biology Concepts and experiments Gerald Karp

7th Edition

Nobel Prizes Awarded for Research in Cell and Molecular Biology Since 1958

Year

Recipient*

Prize

Area of Research

2012

John B. Gurdon Shinya Yamanaka Brian K. Kobilka Robert J. Lefkowitz Bruce A. Beutler Jules A. Hoffmann Ralph M. Steinman

M & P** Chemistry

Animal cloning, nuclear reprogramming Cell reprogramming G protein-coupled receptors

M&P

Innate immunity

700

Dendritic cells and Adaptive immunity Ribosome structure and function

707

2011

2009

2008

2007

2006

2004

2003 2002

2001

2000

1999 1998

Venkatraman Ramakrishnan Thomas A. Steitz Ada E. Yonath Elizabeth H. Blackburn Carol W. Greider Jack W. Szostak Francoise Barré-Sinoussi Luc Montagnier Harald zur Hausen Martin Chalfie Osamu Shimomura Roger Tsien Mario R. Capecchi Martin J. Evans Oliver Smithies Andrew Z. Fire Craig C. Mello Roger D. Kornberg Richard Axel Linda B. Buck Aaron Ciechanover Avram Hershko Irwin Rose Peter Agre Roderick MacKinnon Sydney Brenner John Sulston H. Robert Horvitz John B. Fenn Koichi Tanaka Kurt Wüthrich Leland H. Hartwell Tim Hunt Paul Nurse Arvid Carlsson Paul Greengard Eric Kandel Günter Blobel Robert Furchgott Louis Ignarro Ferid Murad

Chemistry

Pages in Text 513 22, 519 621

479

M&P

Telomeres and telomerase

M&P

Discovery of HIV

Chemistry

Role of HPV in cancer Discovery and development of GFP

668 273, 737

M&P

Development of techniques for knockout mice

778

M&P

RNA Interference

455, 780

Chemistry M&P

Transcription in eukaryotes Olfactory receptors

433, 494 634

Chemistry

Ubiquitin and proteasomes

541

Chemistry

Structure of membrane channels Introduction of C. elegans as a model organism Apoptosis in C. elegans Electrospray ionization in MS MALDI in MS NMR analysis of proteins Control of the cell cycle

M&P

Chemistry

M&P

505

24

150, 152 18 657 758 758 57 576, 611

M&P

Synaptic transmission and signal transduction

168 617

M&P M&P

Protein trafficking NO as intercellular messenger

281 655

Year

Recipient*

Prize

Area of Research

1997

Jens C. Skou Paul Boyer John Walker Stanley B. Prusiner Rolf M. Zinkernagel Peter C. Doherty Edward B. Lewis Christiane Nüsslein-Volhard Eric Wieschaus Alfred Gilman Martin Rodbell Kary Mullis Michael Smith Richard J. Roberts Phillip A. Sharp Edmond Fischer Edwin Krebs Erwin Neher Bert Sakmann Joseph E. Murray E. Donnall Thomas J. Michael Bishop Harold Varmus Thomas R. Cech Sidney Altman Johann Deisenhofer Robert Huber Hartmut Michel Susumu Tonegawa

Chemistry

Na+/K+-ATPase Mechanism of ATP synthesis

157 201

M&P M&P

Protein nature of prions Recognition of virus-infected cells by the immune system Genetic control of embryonic development

66 727

1996 1995

1994 1993

1992 1991 1990 1989

1988

1987 1986 1985 1984

1983 1982 1980

1978

1976 1975

Rita Levi-Montalcini Stanley Cohen Michael S. Brown Joseph L. Goldstein Georges Köhler Cesar Milstein Niels K. Jerne Barbara McClintock Aaron Klug Paul Berg Walter Gilbert Frederick Sanger Baruj Benacerraf Jean Dausset George D. Snell Werner Arber Daniel Nathans Hamilton O. Smith Peter Mitchell D. Carleton Gajdusek David Baltimore Renato Dulbecco Howard M. Temin

M&P

M&P Chemistry M&P M&P M&P M&P M&P Chemistry

Structure and function of GTP-binding (G) proteins Polymerase chain reaction (PCR) Site-directed mutagenesis (SDM) Intervening sequences Alteration of enzyme activity by phosphorylation/dephosphorylation Measurement of ion flux by patch-clamp recording Organ and cell transplantation in human disease Cellular genes capable of causing malignant transformation Ability of RNA to catalyze reactions

Pages in Text

EP12

624 769 778 444 115, 627 152 716, 20 695 477

Chemistry

Bacterial photosynthetic reaction center

218

M&P

DNA rearrangements responsible for antibody diversity Factors that affect nerve outgrowth

713

M&P M&P

379

Regulation of cholesterol metabolism and endocytosis Monoclonal antibodies

319

Antibody formation Mobile elements in the genome Structure of nucleic acid-protein complexes Recombinant DNA technology DNA sequencing technology

704 408 79 764 771

M&P

Major histocompatibility complex

716

M&P

Restriction endonuclease technology

764

Chemistry

Chemiosmotic mechanism of oxidative phosphorylation Prion-based diseases Reverse transcriptase and tumor virus activity

187

M&P

M&P Chemistry Chemistry

M&P M&P

782

66 694

Year

Recipient*

Prize

Area of Research

1974

Albert Claude Christian de Duve George E. Palade Gerald Edelman Rodney R. Porter Christian B. Anfinsen

M&P

Structure and function of internal components of cells

275

M&P

Immunoglobulin structure

711

Chemistry

1972

1971

Earl W. Sutherland

M&P

1970

Bernard Katz Ulf von Euler Luis F. Leloir

M&P

Max Delbrück Alfred D. Hershey Salvador E. Luria H. Gobind Khorana Marshall W. Nirenberg Robert W. Holley Peyton Rous Francois Jacob Andre M. Lwoff Jacques L. Monod Dorothy C. Hodgkin John C. Eccles Alan L. Hodgkin Andrew F. Huxley Francis H. C. Crick James D. Watson Maurice H. F. Wilkins John C. Kendrew Max F. Perutz Melvin Calvin

M&P

Relationship between primary and tertiary structure of proteins Mechanism of hormone action and cyclic AMP Nerve impulse propagation and transmission Role of sugar nucleotides in carbohydrate synthesis Genetic structure of viruses

M&P

Genetic code

M&P M&P

Transfer RNA structure Tumor viruses Bacterial operons and messenger RNA

1969

1968

1966 1965

1964 1963

1962

1961 1960 1959 1958

F. MacFarlane Burnet Peter B. Medawar Arthur Kornberg Severo Ochoa George W. Beadle Joshua Lederberg Edward L. Tatum Frederick Sanger

Chemistry

Pages in Text

63 627 165 285 23, 422

462 465 694 484, 428

Chemistry M&P

X-ray structure of complex biological molecules Ionic basis of nerve membrane potentials

758 164

M&P

Three-dimensional structure of DNA

393

Chemistry

M&P

Three-dimensional structure of globular proteins Biochemistry of CO2 assimilation during photosynthesis Clonal selection theory of antibody formation Synthesis of DNA and RNA

M&P

Gene expression

Chemistry

Primary structure of proteins

Chemistry M&P

*In a few cases, corecipients whose research was in an area outside of cell and molecular biology have been omitted from this list. **Medicine and Physiology

58 226 704 550, 463 427

55


WileyPLUS is a research-based online environment for effective teaching and learning. WileyPLUS builds students’ confidence because it takes the guesswork out of studying by providing students with a clear roadmap: what to do, how to do it, if they did it right. It offers interactive resources along with a complete digital textbook that help students learn more. With WileyPLUS, students take more initiative so you’ll have greater impact on their achievement in the classroom and beyond.


For more information, visit www.wileyplus.com

ALL THE HELP, RESOURCES, AND PERSONAL SUPPORT YOU AND YOUR STUDENTS NEED! www.wileyplus.com/resources

Student Partner Program

2-Minute Tutorials and all of the resources you and your students need to get started

Student support from an experienced student user

Collaborate with your colleagues, find a mentor, attend virtual and live events, and view resources www.WhereFacultyConnect.com

Quick Start © Courtney Keating/ iStockphoto

Pre-loaded, ready-to-use assignments and presentations created by subject matter experts

Technical Support 24/7 FAQs, online chat, and phone support www.wileyplus.com/support

Your WileyPLUS Account Manager, providing personal training and support

7th Edition

Cell and Molecular Biology Concepts and Experiments

Gerald Karp

Chapter 12 was revised in collaboration with

James G. Patton DEPARTMENT OF BIOLOGICAL SCIENCES VANDERBILT UNIVERSITY

VICE PRESIDENT & PUBLISHER: Kaye Pace
ACQUISITIONS EDITOR: Kevin Witt
MARKETING MANAGER: Clay Stone
ASSOCIATE DIRECTOR OF MARKETING: Amy Scholz
CONTENT MANAGER: Juanita Thompson
ASSISTANT EDITOR: Lauren Stauber
ASSOCIATE CONTENT EDITOR: Lauren Morris
SENIOR PRODUCT DESIGNER: Bonnie Roth
PRODUCTION EDITOR: Sandra Dumas
DESIGN DIRECTOR: Harry Nolan
SENIOR DESIGNER: Madelyn Lesure
PHOTO EDITOR: Jennifer Atkins
PRODUCTION MANAGEMENT SERVICES: Furino Production

COVER PHOTO CREDIT: Courtesy Fred H. Gage and Kristen Brennand

Stethoscope icon repeated throughout text: ©Alan Crawford/istockphoto

This book was typeset in 10.5/12 Adobe Caslon at Aptara and printed and bound by QuadGraphics, Inc. The cover was printed by QuadGraphics, Inc.

Founded in 1807, John Wiley & Sons, Inc. has been a valued source of knowledge and understanding for more than 200 years, helping people around the world meet their needs and fulfill their aspirations. Our company is built on a foundation of principles that include responsibility to the communities we serve and where we live and work. In 2008, we launched a Corporate Citizenship Initiative, a global effort to address the environmental, social, economic, and ethical challenges we face in our business. Among the issues we are addressing are carbon impact, paper specifications and procurement, ethical conduct within our business and among our vendors, and community and charitable support. For more information, please visit our website: www.wiley.com/go/citizenship.

The paper in this book was manufactured by a mill whose forest management programs include sustained yield harvesting of its timberlands. Sustained yield harvesting principles ensure that the number of trees cut each year does not exceed the amount of new growth. This book is printed on acid-free paper.

Copyright © 2013, 2010, 2008, 2005, 2002 John Wiley and Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201) 748-6011, fax (201) 748-6008.

Evaluation copies are provided to qualified academics and professionals for review purposes only, for use in their courses during the next academic year. These copies are licensed and may not be sold or transferred to a third party. Upon completion of the review period, please return the evaluation copy to Wiley. Return instructions and a free of charge return shipping label are available at www.wiley.com/go/returnlabel. Outside of the United States, please contact your local representative.

ISBN 13 978-1118-20673-7
ISBN 13 978-1118-30179-1

Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1


To Patsy and Jenny

About the Author

Gerald C. Karp received a bachelor’s degree from UCLA and a Ph.D. from the University of Washington. He conducted postdoctoral research at the University of Colorado Medical Center before joining the faculty at the University of Florida. Gerry is the author of numerous research articles on the cell and molecular biology of early development. His interests have included the synthesis of RNA in early embryos, the movement of mesenchyme cells during gastrulation, and cell determination in slime molds. For 13 years, he taught courses in molecular, cellular, and developmental biology at the University of Florida. During this period, Gerry coauthored a text in developmental biology with N. John Berrill and authored a text in cell and molecular biology. Finding it impossible to carry on life as both full-time professor and author, Gerry gave up his faculty position to concentrate on the revision of this textbook every three years.

About the Cover

The micrograph on the cover of the book shows human nerve cells that have developed (differentiated) in a culture dish from undifferentiated stem cells. The stem cells used in this experiment were pluripotent cells, that is, they were capable of developing into any one of the many different types of cells that make up the human body. In this experiment, the stem cells were driven to differentiate specifically into nerve cells by adding a number of neuron-specific factors to the medium in which the stem cells were growing. Normally, human pluripotent stem cells are only found within the very early stages of a human embryo, but the stem cells used in this experiment were not derived from an embryo but instead were generated experimentally. They were induced from a type of connective tissue cell called a fibroblast by forcing the fibroblast to express a number of genes that it would not normally express. Forcing adult fibroblasts (or other types of adult cells) to express these “stem cell genes” causes them to lose their differentiated properties, such as the production of collagen, and become what has been termed induced pluripotent stem cells (or iPS cells). As discussed on page 22, iPS cells may one day play a key role in replacing the cells of diseased tissues and organs.

The fibroblasts used in this experiment were not derived from a healthy person but from a person who had been diagnosed with schizophrenia. We don’t understand the molecular basis of schizophrenia, but it is hoped that studying the differentiation of nerve cells from persons with this disease will provide important insights into the underlying basis of the disease. Such cells may also serve as a useful tool to screen potential drugs for their effectiveness in treating the disease being studied. Because of these features, such iPS cells have been referred to as “patients in a Petri dish.” (Courtesy Fred H. Gage and Kristen Brennand.)


Preface to the Seventh Edition

Before I began work on the first edition of this text, I drew up a number of basic guidelines regarding the type of book I planned to write.

● I wanted a text suited for an introductory course in cell and molecular biology that ran either a single semester or 1–2 quarters. I set out to draft a text of about 800 pages that would not overwhelm or discourage students at this level.

● I wanted a text that elaborated on fundamental concepts, such as the relationship between molecular structure and function, the dynamic character of cellular organelles, the use of chemical energy in running cellular activities and ensuring accurate macromolecular biosynthesis, the observed unity and diversity at the macromolecular and cellular levels, and the mechanisms that regulate cellular activities.

● I wanted a text that was grounded in the experimental approach. Cell and molecular biology is an experimental science and, like most instructors, I believe students should gain some knowledge of how we know what we know. With this in mind, I decided to approach the experimental nature of the subject in two ways. As I wrote each chapter, I included enough experimental evidence to justify many of the conclusions that were being made. Along the way, I described the salient features of key experimental approaches and research methodologies. Chapters 8 and 9, for example, contain introductory sections on techniques that have proven most important in the analysis of cytomembranes and the cytoskeleton, respectively. I included brief discussions of selected experiments of major importance in the body of the chapters to reinforce the experimental basis of our knowledge. I placed the more detailed aspects of methodologies in a final “techniques chapter” because (1) I did not want to interrupt the flow of discussion of a subject with a large tangential section on technology and (2) I realized that different instructors prefer to discuss a particular technology in connection with different subjects. For students and instructors who wanted to explore the experimental approach in greater depth, I included an Experimental Pathways at the end of most chapters. Each of these narratives describes some of the key experimental findings that have led to our current understanding of a particular subject that is relevant to the chapter at hand. Because the scope of the narrative is limited, the design of the experiments can be considered in some detail. The figures and tables provided in these sections are often those that appeared in the original research article, which provides the reader an opportunity to examine original data and to realize that its analysis is not beyond their means. The Experimental Pathways also illustrate the stepwise nature of scientific discovery, showing how the result of one study raises questions that provide the basis for subsequent studies.

● I wanted a text that was interesting and readable. To make the text more relevant to undergraduate readers, particularly premedical students, I included The Human Perspective. These sections illustrate that virtually all human disorders can

be traced to disruption of activities at the cellular and molecular level. Furthermore, they reveal the importance of basic research as the pathway to understanding and eventually treating most disorders. In Chapter 11, for example, The Human Perspective describes how small synthetic siRNAs may prove to be an important new tool in the treatment of cancer and viral diseases, including AIDS. In this same chapter, the reader will learn how the action of such RNAs was first revealed in studies on plants and nematodes. It becomes evident that one can never predict the practical importance of basic research in cell and molecular biology. I have also tried to include relevant information about human biology and clinical applications throughout the body of the text.

● I wanted a high-quality illustration program that helped students visualize complex cellular and molecular processes. To meet this goal, many of the illustrations have been “stepped-out” so that information can be more easily broken down into manageable parts. Events occurring at each step are described in the figure legend and/or in the corresponding text. I also sought to include a large number of micrographs to enable students to see actual representations of most subjects being discussed. Included among the images are many fluorescence micrographs that either illustrate the dynamic properties of cells or provide a means to localize a specific protein or nucleic acid sequence. Wherever possible, I have tried to pair line art drawings with micrographs to help students compare idealized and actual versions of a structure.

The most important changes in the seventh edition can be delineated as follows:

● Each of the illustrations has been carefully scrutinized and a large number of drawings have been modified with the goal of achieving greater consistency and quality. Particular attention has been paid to the continuity of color and rendering style for each structure and element, as they are represented within each figure, and throughout the book.

● The illustration program for the seventh edition includes a new feature called Figure in Focus. The premise of this feature is to highlight one of the chapter’s key topics in a visually interesting way. Focusing attention on these figures, through the use of line art, 3D molecular models, and micrographs, provides a clear visual explanation of one of the chapter’s core concepts.

● The body of information in cell and molecular biology is continually changing, which provides much of the excitement we all feel about our selected field. Even though only three years have passed since the publication of the sixth edition, nearly every discussion in the text has been modified to a greater or lesser degree. This has been done without allowing the chapters to increase significantly in length.

● Altogether, the seventh edition contains more than 100 new micrographs and computer-derived images, all of which were provided by the original source.


WileyPLUS

This online teaching and learning environment integrates the entire digital textbook with the most effective instructor and student resources to fit every learning style.

With WileyPLUS:
● Students achieve concept mastery in a rich, structured environment that’s available 24/7.
● Instructors personalize and manage their course more effectively with assessment, assignments, grade tracking, and more.

WileyPLUS can complement your current textbook or replace the printed text altogether.

For Students

Personalize the learning experience

Different learning styles, different levels of proficiency, different levels of preparation: each of your students is unique. WileyPLUS empowers them to take advantage of their individual strengths:
● Students receive timely access to resources that address their demonstrated needs, and get immediate feedback and remediation when needed.
● Integrated, multi-media resources include:
  ● Animations of key concepts based on the illustrations of the text.
  ● Video library of clips from leading journals, which can now be assigned with accompanying questions by Anne Hemsley, Antelope Valley Community College.
● WileyPLUS includes many opportunities for self-assessment linked to the relevant portions of the text. Students can take control of their own learning and practice until they master the material.

For Instructors

Personalize the teaching experience

WileyPLUS empowers you with the tools and resources you need to make your teaching even more effective:
● You can customize your classroom presentation with a wealth of resources and functionality from PowerPoint slides to a database of rich visuals. You can even add your own materials to your WileyPLUS course.
● With WileyPLUS you can identify those students who are falling behind and intervene accordingly, without having to wait for them to come to office hours.
● WileyPLUS simplifies and automates such tasks as student performance assessment, making assignments, scoring student work, keeping grades, and more.
● Pre and Post Lecture Assessment by Joel Piperberg, Millersville University.
● Test Bank, Instructor’s Manual, and “Clicker” Questions by Joel Piperberg, Millersville University.
● NEW Lecture PowerPoint Presentations by Edmund B. Rucker, University of Kentucky.

Clinical and Experimental Focus!
● Clinical Case Studies and accompanying questions by Claire Walczak, Indiana University & Anthony Contento, SUNY Oswego.
● Clinical Connections Questions by Sarah VanVickle-Chavez, Washington University in St. Louis.
● Experimental Pathways Questions by Joel Piperberg, Millersville University.
● NEW Figure in Focus feature by Anthony Contento, SUNY Oswego. New podcasts & assessment questions accompany selected figures, highlighting important concepts & processes.

Book Companion Site (www.wiley.com/college/karp)

For the Student
● Quizzes for student self-testing.
● Biology NewsFinder; Flash Cards; and Animations.
● Answers to the end-of-chapter Analytic Questions.
● Additional reading resources provide students with an extensive list of additional useful sources of information.
● Experimental Pathways for Chapters 5, 6, 7, 9, 12, 13, and 15.

For the Instructor
● Biology Visual Library; all images in jpg and PowerPoint formats.
● Instructor’s Manual; Test Bank; Clicker Questions; Lecture PowerPoint Presentations.

Instructor Resources are password protected.

Acknowledgments

I am particularly grateful to James Patton of Vanderbilt University for providing a revised version of Chapter 12 on The Control of Gene Expression, which formed the basis of the current chapter in this text. There are many people at John Wiley & Sons who have made important contributions to this text. I continue to be grateful to Geraldine Osnato, whose work and support over two editions are not forgotten. Ably taking her place in this edition was Lauren Stauber, who served as the assistant editor on the project with the guidance of Kevin Witt. Thanks also go to Lauren Morris for directing the development of the diverse supplements that are offered with this text. I am particularly indebted to the Wiley production staff, who are simply the best. Jeanine Furino, of Furino Production, served as the central nervous system, coordinating the information arriving from compositors, copyeditors, proofreaders, illustrators, photo editors, designers, and dummiers, as well as the constant barrage of text changes ordered by the author. Always calm, organized, and meticulous, she made sure everything was done correctly. Hilary Newman and Jennifer Atkins were responsible for obtaining all of the many new images that are found in this edition. Hilary and Jennifer are skillful and perseverant, and I have utmost confidence in their ability to obtain any image requested. The book has a complex illustration program and Kathy Naylor did a superb job in coordinating all of the many facets required to guide it to completion. The elegant design of the book and cover is due to the efforts of Madelyn Lesure, whose talents are evident. A special thanks is owed Laura Ierardi who skillfully laid out the pages for each chapter. I am especially thankful to the many biologists who have contributed micrographs for use in this book; more than any other element, these images bring the study of cell biology to life on the printed page. Finally, I would like to apologize in advance for any errors that may occur in the text, and express my heartfelt embarrassment. I am grateful for the constructive criticism and sound advice from the following reviewers of the most recent editions:

Seventh edition reviewers:

STEVE ALAS, California State Polytechnic University, Pomona
RAVI ALLADA, Northwestern University
KARL J. AUFDERHEIDE, Texas A&M University
KENNETH J. BALAZOVICH, University of Michigan
ALLAN BLAKE, Seton Hall University
MARTIN BOOTMAN, Babraham Institute
DAVID BOURGAIZE, Whittier College
KENT D. CHAPMAN, University of North Texas
KATE COOPER, Loras College
LINDA DEVEAUX, Idaho State University
RICHARD E. DEARBORN, Albany College of Pharmacy
BENJAMIN GLICK, The University of Chicago
REGINALD HALABY, Montclair State University
MICHAEL HAMPSEY, University of Medicine and Dentistry of New Jersey
MICHAEL HARRINGTON, University of Alberta
MARCIA HARRISON, Marshall University
R. SCOTT HAWLEY, American Cancer Society Research Professor
REBECCA HEALD, University of California, Berkeley
MARK HENS, University of North Carolina, Greensboro
JEN-CHIH HSIEH, State University of New York at Stony Brook
MICHAEL JONZ, University of Ottawa
ROLAND KAUNAS, Texas A&M University
TOM KELLER, Florida State University
REBECCA KELLUM, University of Kentucky
GREG M. KELLY, University of Western Ontario
KIM KIRBY, University of Guelph
CLAIRE M. LEONARD, William Paterson University
FAITH LIEBL, Southern Illinois University, Edwardsville
JON LOWRANCE, Lipscomb University
CHARLES MALLERY, University of Miami
MICHAEL A. MCALEAR, Wesleyan University
JOANN MEERSCHAERT, St. Cloud State University
JOHN MENNINGER, University of Iowa
KIRSTEN MONSEN, Montclair State University
ALAN NIGHORN, University of Arizona
ROBERT M. NISSEN, California State University, Los Angeles
VERONICA C. NWOSU, North Carolina Central University
GREG ODORIZZI, University of Colorado, Boulder
JAMES G. PATTON, Vanderbilt University
CHARLES PUTNAM, University of Arizona
DAVID REISMAN, University of South Carolina
SHIVENDRA V. SAHI, Western Kentucky University
INDER M. SAXENA, University of Texas, Austin
TIM SCHUH, St. Cloud State University
ERIC SHELDEN, Washington State University
ROGER D. SLOBODA, Dartmouth College
ANN STURTEVANT, University of Michigan-Flint
WILLIAM TERZAGHI, Wilkes University
PAUL TWIGG, University of Nebraska-Kearney
CLAIRE E. WALCZAK, Indiana University
PAUL E. WANDA, Southern Illinois University, Edwardsville
ANDREW WOOD, Southern Illinois University
DANIELA ZARNESCU, University of Arizona
JIANZHI ZHANG, University of Michigan

Thanks are still owed to the following reviewers of the previous several editions:

LINDA AMOS, MRC Laboratory of Molecular Biology
GERALD T. BABCOCK, Michigan State University
WILLIAM E. BALCH, The Scripps Research Institute
JAMES BARBER, Imperial College of Science, Wolfson Laboratories
JOHN D. BELL, Brigham Young University
WENDY A. BICKMORE, Medical Research Council, United Kingdom
ASHOK BIDWAI, West Virginia University

DANIEL BRANTON, Harvard University
THOMAS R. BREEN, Southern Illinois University
SHARON K. BULLOCK, Virginia Commonwealth University
RODERICK A. CAPALDI, University of Oregon
GORDON G. CARMICHAEL, University of Connecticut Health Center
RATNA CHAKRABARTI, University of Central Florida
K. H. ANDY CHOO, Royal Children’s Hospitals, The Murdoch Institute
DENNIS O. CLEGG, University of California, Santa Barbara
RONALD H. COOPER, University of California, Los Angeles
PHILIPPA D. DARBRE, University of Reading
ROGER W. DAVENPORT, University of Maryland
SUSAN DESIMONE, Middlebury College
BARRY J. DICKSON, Research Institute of Molecular Pathology
DAVID DOE, Westfield State College
ROBERT S. DOTSON, Tulane University
JENNIFER A. DOUDNA, Yale University
MICHAEL EDIDIN, Johns Hopkins University
EVAN E. EICHLER, University of Washington
ARRI EISEN, Emory University
ROBERT FILLINGAME, University of Wisconsin Medical School
ORNA COHEN-FIX, National Institute of Health, Laboratory of Molecular and Cellular Biology
JACEK GAERTIG, University of Georgia
REGINALD HALABY, Montclair State University
ROBERT HELLING, University of Michigan
ARTHUR HORWICH, Yale University School of Medicine
JOEL A. HUBERMAN, Roswell Park Cancer Institute
GREGORY D. D. HURST, University College London
KEN JACOBSON, University of North Carolina
MARIE JANICKE, University at Buffalo, SUNY
HAIG H. KAZAZIAN, JR., University of Pennsylvania
LAURA R. KELLER, Florida State University
NEMAT O. KEYHANI, University of Florida
NANCY KLECKNER, Harvard University
WERNER KÜHLBRANDT, Max-Planck-Institut für Biophysik
JAMES LAKE, University of California, Los Angeles
ROBERT C. LIDDINGTON, Burnham Institute
VISHWANATH R. LINGAPPA, University of California, San Francisco
JEANNETTE M. LOUTSCH, Arkansas State University
MARGARET LYNCH, Tufts University
ARDYTHE A. MCCRACKEN, University of Nevada, Reno
THOMAS MCKNIGHT, Texas A&M University
MICHELLE MORITZ, University of California, San Francisco
ANDREW NEWMAN, Cambridge University
JONATHAN NUGENT, University of London
MIKE O’DONNELL, Rockefeller University
JAMES PATTON, Vanderbilt University
HUGH R. B. PELHAM, MRC Laboratory of Molecular Biology
JONATHAN PINES, Wellcome/CRC Institute
DEBRA PIRES, University of California, Los Angeles
MITCH PRICE, Pennsylvania State University
DONNA RITCH, University of Wisconsin, Green Bay
JOEL L. ROSENBAUM, Yale University
WOLFRAM SAENGER, Freie Universität Berlin
RANDY SCHEKMAN, University of California, Berkeley
SANDRA SCHMID, The Scripps Research Institute
TRINA SCHROER, Johns Hopkins University
DAVID SCHULTZ, University of Louisville
ROD SCOTT, Wheaton College
KATIE SHANNON, University of North Carolina, Chapel Hill
JOEL B. SHEFFIELD, Temple University
DENNIS SHEVLIN, College of New Jersey
HARRIETT E. SMITH-SOMERVILLE, University of Alabama
BRUCE STILLMAN, Cold Spring Harbor Laboratory
ADRIANA STOICA, Georgetown University
COLLEEN TALBOT, California State University, San Bernardino
GISELLE THIBAUDEAU, Mississippi State University
JEFFREY L. TRAVIS, University at Albany, SUNY
NIGEL UNWIN, MRC Laboratory of Molecular Biology
AJIT VARKI, University of California, San Diego
JOSE VAZQUEZ, New York University
JENNIFER WATERS, Harvard University
CHRIS WATTERS, Middlebury College
ANDREW WEBBER, Arizona State University
BEVERLY WENDLAND, Johns Hopkins University
GARY M. WESSEL, Brown University
ERIC V. WONG, University of Louisville
GARY YELLEN, Harvard Medical School
MASASUKE YOSHIDA, Tokyo Institute of Technology
ROBERT A. ZIMMERMAN, University of Massachusetts

To the Student

At the time I began college, biology would have been at the bottom of a list of potential majors. I enrolled in a physical anthropology course to fulfill the life science requirement by the easiest possible route. During that course, I learned for the first time about chromosomes, mitosis, and genetic recombination, and I became fascinated by the intricate activities that could take place in such a small volume of cellular space. The next semester, I took Introductory Biology and began to seriously consider becoming a cell biologist. I am burdening you with this personal trivia so you will understand why I wrote this book and to warn you of possible repercussions.

Even though many years have passed, I still find cell biology the most fascinating subject to explore, and I still love spending the day reading about the latest findings by colleagues in the field. Thus, for me, writing a text on cell biology provides a reason and an opportunity to keep abreast of what is going on throughout the field. My primary goal in writing this text is to help generate an appreciation in students for the activities in which the giant molecules and minuscule structures that inhabit the cellular world of life are engaged. Another goal is to provide the reader with an insight into the types of questions that cell and molecular biologists ask and the experimental approaches they use to seek answers. As you read the text, think like a researcher; consider the evidence that is presented, think of alternate explanations, plan experiments that could lead to new hypotheses.

You might begin this approach by looking at one of the many electron micrographs that fill the pages of this text. To take this photograph, you would be sitting in a small, pitch-black room in front of a large metallic instrument whose column rises several meters above your head. You are looking through a pair of binoculars at a vivid, bright green screen. The parts of the cell you are examining appear dark and colorless against the bright green background. They are dark because they’ve been stained with heavy metal atoms that deflect a fraction of the electrons within a beam that is being focused on the viewing screen by large electromagnetic lenses in the wall of the column. The electrons that strike the screen are accelerated through the evacuated space of the column by a potential difference of tens of thousands of volts. One of your hands may be gripping a knob that controls the magnifying power of the lenses. A simple turn of this knob can switch the image in front of your eyes from that of a whole field of cells to a tiny part of a cell, such as a few ribosomes or a small portion of a single membrane. By turning other knobs, you can watch different parts of the specimen glide across the screen, giving you the sensation that you’re driving around inside a cell.

Because the study of cell function requires the use of considerable instrumentation, such as the electron microscope just described, the investigator is physically removed from the subject being studied. To a large degree, cells are like tiny black boxes. We have developed many ways to probe the boxes, but we are always groping in an area that cannot be fully illuminated. A discovery is made or a new technique is developed and a new thin beam of light penetrates the box. With further work, our understanding of the structure or process is broadened, but we are always left with additional questions. We generate more complete and sophisticated constructions, but we can never be sure how closely our views approach reality.

In this regard, the study of cell and molecular biology can be compared to the study of an elephant as conducted by six blind men in an old Indian fable. The six travel to a nearby palace to learn about the nature of elephants. When they arrive, each approaches the elephant and begins to touch it. The first blind man touches the side of the elephant and concludes that an elephant is smooth like a wall. The second touches the trunk and decides that an elephant is round like a snake. The other members of the group touch the tusk, leg, ear, and tail of the elephant, and each forms his impression of the animal based on his own limited experiences. Cell biologists are limited in a similar manner as to what they can learn by using a particular technique or experimental approach. Although each new piece of information adds to the preexisting body of knowledge to provide a better concept of the activity being studied, the total picture remains uncertain.

Before closing these introductory comments, let me take the liberty of offering the reader some advice: Don’t accept everything you read as being true. There are several reasons for urging such skepticism. Undoubtedly, there are errors in this text that reflect the author’s ignorance or misinterpretation of some aspect of the scientific literature. But, more importantly, we should consider the nature of biological research. Biology is an empirical science; nothing is ever proved. We compile data concerning a particular cell organelle, metabolic reaction, intracellular movement, etc., and draw some type of conclusion. Some conclusions rest on more solid evidence than others. Even if there is a consensus concerning the “facts” regarding a particular phenomenon, there are often several possible interpretations of the data. Hypotheses are put forth and generally stimulate further research, thereby leading to a reevaluation of the original proposal. Most hypotheses that remain valid undergo a sort of evolution and, when presented in the text, should not be considered wholly correct or incorrect.

Cell biology is a rapidly moving field and some of the best hypotheses often generate considerable controversy. Even though this is a textbook where one expects to find material that is well tested, there are many places where new ideas are presented. These ideas are often described as models. I’ve included such models because they convey the current thinking in the field, even if they are speculative. Moreover, they reinforce the idea that cell biologists operate at the frontier of science, a boundary between the unknown and known (or thought to be known). Remain skeptical.


Contents

1 Introduction to the Study of Cell and Molecular Biology
1.1 The Discovery of Cells
1.2 Basic Properties of Cells
Cells Are Highly Complex and Organized
Cells Possess a Genetic Program and the Means to Use It
Cells Are Capable of Producing More of Themselves
Cells Acquire and Utilize Energy
Cells Carry Out a Variety of Chemical Reactions
Cells Engage in Mechanical Activities
Cells Are Able to Respond to Stimuli
Cells Are Capable of Self-Regulation
Cells Evolve
1.3 Two Fundamentally Different Classes of Cells
Characteristics That Distinguish Prokaryotic and Eukaryotic Cells
Types of Prokaryotic Cells
Types of Eukaryotic Cells: Cell Specialization
The Sizes of Cells and Their Components
Synthetic Biology
● THE HUMAN PERSPECTIVE: The Prospect of Cell Replacement Therapy
1.4 Viruses
Viroids
● EXPERIMENTAL PATHWAYS: The Origin of Eukaryotic Cells

2 The Chemical Basis of Life
2.1 Covalent Bonds
Polar and Nonpolar Molecules
Ionization
2.2 Noncovalent Bonds
● THE HUMAN PERSPECTIVE: Free Radicals as a Cause of Aging
Ionic Bonds: Attractions between Charged Atoms
Hydrogen Bonds
Hydrophobic Interactions and van der Waals Forces
The Life-Supporting Properties of Water
2.3 Acids, Bases, and Buffers
2.4 The Nature of Biological Molecules
Functional Groups
A Classification of Biological Molecules by Function
2.5 Four Types of Biological Molecules
Carbohydrates
Lipids
Proteins
● THE HUMAN PERSPECTIVE: Protein Misfolding Can Have Deadly Consequences
Nucleic Acids
2.6 The Formation of Complex Macromolecular Structures
The Assembly of Tobacco Mosaic Virus Particles and Ribosomal Subunits
● EXPERIMENTAL PATHWAYS: Chaperones: Helping Proteins Reach Their Proper Folded State

3 Bioenergetics, Enzymes, and Metabolism
3.1 Bioenergetics
The Laws of Thermodynamics and the Concept of Entropy
Free Energy
3.2 Enzymes as Biological Catalysts
The Properties of Enzymes
Overcoming the Activation Energy Barrier
The Active Site
Mechanisms of Enzyme Catalysis
Enzyme Kinetics
● THE HUMAN PERSPECTIVE: The Growing Problem of Antibiotic Resistance
3.3 Metabolism
An Overview of Metabolism
Oxidation and Reduction: A Matter of Electrons
The Capture and Utilization of Energy
Metabolic Regulation

4 The Structure and Function of the Plasma Membrane
4.1 An Overview of Membrane Functions
4.2 A Brief History of Studies on Plasma Membrane Structure
4.3 The Chemical Composition of Membranes
Membrane Lipids
The Asymmetry of Membrane Lipids
Membrane Carbohydrates
4.4 The Structure and Functions of Membrane Proteins
Integral Membrane Proteins
Studying the Structure and Properties of Integral Membrane Proteins
Peripheral Membrane Proteins
Lipid-Anchored Membrane Proteins
4.5 Membrane Lipids and Membrane Fluidity
The Importance of Membrane Fluidity
Maintaining Membrane Fluidity
Lipid Rafts
4.6 The Dynamic Nature of the Plasma Membrane
The Diffusion of Membrane Proteins after Cell Fusion
Restrictions on Protein and Lipid Mobility
The Red Blood Cell: An Example of Plasma Membrane Structure
4.7 The Movement of Substances Across Cell Membranes
The Energetics of Solute Movement
Diffusion of Substances through Membranes
Facilitated Diffusion
Active Transport
● THE HUMAN PERSPECTIVE: Defects in Ion Channels and Transporters as a Cause of Inherited Disease
4.8 Membrane Potentials and Nerve Impulses
The Resting Potential
The Action Potential
Propagation of Action Potentials as an Impulse
Neurotransmission: Jumping the Synaptic Cleft
● EXPERIMENTAL PATHWAYS: The Acetylcholine Receptor

5 Aerobic Respiration and the Mitochondrion
5.1 Mitochondrial Structure and Function
Mitochondrial Membranes
The Mitochondrial Matrix
5.2 Oxidative Metabolism in the Mitochondrion
The Tricarboxylic Acid (TCA) Cycle
The Importance of Reduced Coenzymes in the Formation of ATP
● THE HUMAN PERSPECTIVE: The Role of Anaerobic and Aerobic Metabolism in Exercise
5.3 The Role of Mitochondria in the Formation of ATP
Oxidation–Reduction Potentials
Electron Transport
Types of Electron Carriers
5.4 Translocation of Protons and the Establishment of a Proton-Motive Force
5.5 The Machinery for ATP Formation
The Structure of ATP Synthase
The Basis of ATP Formation According to the Binding Change Mechanism
Other Roles for the Proton-Motive Force in Addition to ATP Synthesis
5.6 Peroxisomes
● THE HUMAN PERSPECTIVE: Diseases that Result from Abnormal Mitochondrial or Peroxisomal Function

6 Photosynthesis and the Chloroplast
6.1 Chloroplast Structure and Function
6.2 An Overview of Photosynthetic Metabolism
6.3 The Absorption of Light
Photosynthetic Pigments
6.4 Photosynthetic Units and Reaction Centers
Oxygen Formation: Coordinating the Action of Two Different Photosynthetic Systems
Killing Weeds by Inhibiting Electron Transport
6.5 Photophosphorylation
Noncyclic Versus Cyclic Photophosphorylation
6.6 Carbon Dioxide Fixation and the Synthesis of Carbohydrate
Carbohydrate Synthesis in C3 Plants
Carbohydrate Synthesis in C4 Plants
Carbohydrate Synthesis in CAM Plants

7 Interactions Between Cells and Their Environment
7.1 The Extracellular Space
The Extracellular Matrix
7.2 Interactions of Cells with Extracellular Materials
Integrins
Focal Adhesions and Hemidesmosomes: Anchoring Cells to Their Substratum
7.3 Interactions of Cells with Other Cells
Selectins
The Immunoglobulin Superfamily
Cadherins
● THE HUMAN PERSPECTIVE: The Role of Cell Adhesion in Inflammation and Metastasis
Adherens Junctions and Desmosomes: Anchoring Cells to Other Cells
The Role of Cell-Adhesion Receptors in Transmembrane Signaling
7.4 Tight Junctions: Sealing The Extracellular Space
7.5 Gap Junctions and Plasmodesmata: Mediating Intercellular Communication
Plasmodesmata
7.6 Cell Walls

8 Cytoplasmic Membrane Systems: Structure, Function, and Membrane Trafficking
8.1 An Overview of the Endomembrane System
8.2 A Few Approaches to the Study of Endomembranes
Insights Gained from Autoradiography
Insights Gained from the Use of the Green Fluorescent Protein
Insights Gained from the Biochemical Analysis of Subcellular Fractions
Insights Gained from the Use of Cell-Free Systems
Insights Gained from the Study of Mutant Phenotypes
8.3 The Endoplasmic Reticulum
The Smooth Endoplasmic Reticulum
Functions of the Rough Endoplasmic Reticulum
From the ER to the Golgi Complex: The First Step in Vesicular Transport
8.4 The Golgi Complex
Glycosylation in the Golgi Complex
The Movement of Materials through the Golgi Complex
8.5 Types of Vesicle Transport and Their Functions
COPII-Coated Vesicles: Transporting Cargo from the ER to the Golgi Complex
COPI-Coated Vesicles: Transporting Escaped Proteins Back to the ER
Beyond the Golgi Complex: Sorting Proteins at the TGN
Targeting Vesicles to a Particular Compartment
8.6 Lysosomes
Autophagy
● THE HUMAN PERSPECTIVE: Disorders Resulting from Defects in Lysosomal Function
8.7 Plant Cell Vacuoles
8.8 The Endocytic Pathway: Moving Membrane and Materials into the Cell Interior
Endocytosis
Phagocytosis
8.9 Posttranslational Uptake of Proteins by Peroxisomes, Mitochondria, and Chloroplasts
Uptake of Proteins into Peroxisomes
Uptake of Proteins into Mitochondria
Uptake of Proteins into Chloroplasts
● EXPERIMENTAL PATHWAYS: Receptor-Mediated Endocytosis

9 The Cytoskeleton and Cell Motility
9.2 The Study of the Cytoskeleton
The Use of Live-Cell Fluorescence Imaging
The Use of In Vitro and In Vivo Single-Molecule Assays
The Use of Fluorescence Imaging Techniques to Monitor the Dynamics of the Cytoskeleton
9.3 Microtubules
Structure and Composition
Microtubule-Associated Proteins
Microtubules as Structural Supports and Organizers
Microtubules as Agents of Intracellular Motility
Motor Proteins that Traverse the Microtubular Cytoskeleton
Microtubule-Organizing Centers (MTOCs)
The Dynamic Properties of Microtubules
Cilia and Flagella: Structure and Function
● THE HUMAN PERSPECTIVE: The Role of Cilia in Development and Disease
9.4 Intermediate Filaments
Intermediate Filament Assembly and Disassembly
Types and Functions of Intermediate Filaments
9.5 Microfilaments
Microfilament Assembly and Disassembly
Myosin: The Molecular Motor of Actin Filaments
9.6 Muscle Contractility
The Sliding Filament Model of Muscle Contraction
9.7 Nonmuscle Motility
Actin-Binding Proteins
Examples of Nonmuscle Motility and Contractility

10 The Nature of the Gene and the Genome 386 10.1 The Concept of a Gene as a Unit of Inheritance 387 10.2 Chromosomes: The Physical Carriers of the Genes 388 The Discovery of Chromosomes 388 Chromosomes as the Carriers of Genetic Information 389 Genetic Analysis in Drosophila 390 Crossing Over and Recombination 390 Mutagenesis and Giant Chromosomes 392

10.3 The Chemical Nature of the Gene 393 The Structure of DNA 393 The Watson-Crick Proposal 394 DNA Supercoiling 397

10.4 The Structure of the Genome 398 The Complexity of the Genome 399 ● TH E H U MAN P E R S P ECTIVE: Diseases that Result from Expansion of Trinucleotide Repeats 404

10.5 The Stability of the Genome 406 Whole-Genome Duplication (Polyploidization) 406 Duplication and Modification of DNA Sequences 407 “Jumping Genes” and the Dynamic Nature of the Genome 408

Contents

9.1 Overview of the Major Functions of the Cytoskeleton 325

324

9.3 Microtubules 330

10.6 Sequencing Genomes: The Footprints of Biological Evolution
Comparative Genomics: “If It’s Conserved, It Must Be Important”
The Genetic Basis of “Being Human”
Genetic Variation Within the Human Species Population
THE HUMAN PERSPECTIVE: Application of Genomic Analyses to Medicine
EXPERIMENTAL PATHWAYS: The Chemical Nature of the Gene

11 Gene Expression: From Transcription to Translation
11.1 The Relationship between Genes, Proteins, and RNAs
An Overview of the Flow of Information through the Cell
11.2 An Overview of Transcription in Both Prokaryotic and Eukaryotic Cells
Transcription in Bacteria
Transcription and RNA Processing in Eukaryotic Cells
11.3 Synthesis and Processing of Eukaryotic Ribosomal and Transfer RNAs
Synthesizing the rRNA Precursor
Processing the rRNA Precursor
Synthesis and Processing of the 5S rRNA
Transfer RNAs
11.4 Synthesis and Processing of Eukaryotic Messenger RNAs
The Machinery for mRNA Transcription
Split Genes: An Unexpected Finding
The Processing of Eukaryotic Messenger RNAs
Evolutionary Implications of Split Genes and RNA Splicing
Creating New Ribozymes in the Laboratory
11.5 Small Regulatory RNAs and RNA Silencing Pathways
THE HUMAN PERSPECTIVE: Clinical Applications of RNA Interference
MicroRNAs: Small RNAs that Regulate Gene Expression
piRNAs: A Class of Small RNAs that Function in Germ Cells
Other Noncoding RNAs
11.6 Encoding Genetic Information
The Properties of the Genetic Code
11.7 Decoding the Codons: The Role of Transfer RNAs
The Structure of tRNAs
11.8 Translating Genetic Information
Initiation
Elongation
Termination
mRNA Surveillance and Quality Control
Polyribosomes
EXPERIMENTAL PATHWAYS: The Role of RNA as a Catalyst

12 Control of Gene Expression
12.1 Control of Gene Expression in Bacteria
Organization of Bacterial Genomes
The Bacterial Operon
Riboswitches
12.2 Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus
The Nuclear Envelope
Chromosomes and Chromatin
THE HUMAN PERSPECTIVE: Chromosomal Aberrations and Human Disorders
Epigenetics: There’s More to Inheritance than DNA
The Nucleus as an Organized Organelle
12.3 An Overview of Gene Regulation in Eukaryotes
12.4 Transcriptional Control
The Role of Transcription Factors in Regulating Gene Expression
The Structure of Transcription Factors
DNA Sites Involved in Regulating Transcription
Transcriptional Activation: The Role of Enhancers, Promoters, and Coactivators
Transcriptional Repression
12.5 RNA Processing Control
12.6 Translational Control
Initiation of Translation
Cytoplasmic Localization of mRNAs
The Control of mRNA Stability
The Role of MicroRNAs in Translational Control
12.7 Posttranslational Control: Determining Protein Stability

13 DNA Replication and Repair
13.1 DNA Replication
Semiconservative Replication
Replication in Bacterial Cells
The Structure and Functions of DNA Polymerases
Replication in Eukaryotic Cells
13.2 DNA Repair
Nucleotide Excision Repair
Base Excision Repair
Mismatch Repair
Double-Strand Breakage Repair
13.3 Between Replication and Repair
THE HUMAN PERSPECTIVE: The Consequences of DNA Repair Deficiencies

14 Cellular Reproduction
14.1 The Cell Cycle
Cell Cycles in Vivo
Control of the Cell Cycle
14.2 M Phase: Mitosis and Cytokinesis
Prophase
Prometaphase
Metaphase
Anaphase
Telophase
Motor Proteins Required for Mitotic Movements
Cytokinesis
14.3 Meiosis
The Stages of Meiosis
THE HUMAN PERSPECTIVE: Meiotic Nondisjunction and Its Consequences
Genetic Recombination During Meiosis
EXPERIMENTAL PATHWAYS: The Discovery and Characterization of MPF

15 Cell Signaling and Signal Transduction: Communication Between Cells
15.1 The Basic Elements of Cell Signaling Systems
15.2 A Survey of Extracellular Messengers and Their Receptors
15.3 G Protein-Coupled Receptors and Their Second Messengers
Signal Transduction by G Protein-Coupled Receptors
THE HUMAN PERSPECTIVE: Disorders Associated with G Protein-Coupled Receptors
Second Messengers
The Specificity of G Protein-Coupled Responses
Regulation of Blood Glucose Levels
The Role of GPCRs in Sensory Perception
15.4 Protein-Tyrosine Phosphorylation as a Mechanism for Signal Transduction
The Ras-MAP Kinase Pathway
Signaling by the Insulin Receptor
THE HUMAN PERSPECTIVE: Signaling Pathways and Human Longevity
Signaling Pathways in Plants
15.5 The Role of Calcium as an Intracellular Messenger
Regulating Calcium Concentrations in Plant Cells
15.6 Convergence, Divergence, and Cross-Talk Among Different Signaling Pathways
Examples of Convergence, Divergence, and Cross-Talk Among Signaling Pathways
15.7 The Role of NO as an Intercellular Messenger
15.8 Apoptosis (Programmed Cell Death)
The Extrinsic Pathway of Apoptosis
The Intrinsic Pathway of Apoptosis

16 Cancer
16.1 Basic Properties of a Cancer Cell
16.2 The Causes of Cancer
16.3 The Genetics of Cancer
Tumor-Suppressor Genes and Oncogenes: Brakes and Accelerators
The Cancer Genome
Gene-Expression Analysis
16.4 New Strategies for Combating Cancer
Immunotherapy
Inhibiting the Activity of Cancer-Promoting Proteins
Inhibiting the Formation of New Blood Vessels (Angiogenesis)
EXPERIMENTAL PATHWAYS: The Discovery of Oncogenes

17 The Immune Response
17.1 An Overview of the Immune Response
Innate Immune Responses
Adaptive Immune Responses
17.2 The Clonal Selection Theory as It Applies to B Cells
Vaccination
17.3 T Lymphocytes: Activation and Mechanism of Action
17.4 Selected Topics on the Cellular and Molecular Basis of Immunity
The Modular Structure of Antibodies
DNA Rearrangements that Produce Genes Encoding B- and T-Cell Antigen Receptors
Membrane-Bound Antigen Receptor Complexes
The Major Histocompatibility Complex
Distinguishing Self from Nonself
Lymphocytes Are Activated by Cell-Surface Signals
Signal Transduction Pathways in Lymphocyte Activation
THE HUMAN PERSPECTIVE: Autoimmune Diseases
EXPERIMENTAL PATHWAYS: The Role of the Major Histocompatibility Complex in Antigen Presentation

18 Techniques in Cell and Molecular Biology
18.1 The Light Microscope
Resolution
Visibility
Preparation of Specimens for Bright-Field Light Microscopy
Phase-Contrast Microscopy
Fluorescence Microscopy (and Related Fluorescence-Based Techniques)
Video Microscopy and Image Processing
Laser Scanning Confocal Microscopy
Super-Resolution Fluorescence Microscopy
18.2 Transmission Electron Microscopy
Specimen Preparation for Electron Microscopy
18.3 Scanning Electron and Atomic Force Microscopy
Atomic Force Microscopy
18.4 The Use of Radioisotopes
18.5 Cell Culture
18.6 The Fractionation of a Cell’s Contents by Differential Centrifugation
18.7 Isolation, Purification, and Fractionation of Proteins
Selective Precipitation
Liquid Column Chromatography
Polyacrylamide Gel Electrophoresis
Protein Measurement and Analysis
18.8 Determining the Structure of Proteins and Multisubunit Complexes
18.9 Fractionation of Nucleic Acids
Separation of DNAs by Gel Electrophoresis
Separation of Nucleic Acids by Ultracentrifugation
18.10 Nucleic Acid Hybridization
18.11 Chemical Synthesis of DNA
18.12 Recombinant DNA Technology
Restriction Endonucleases
Formation of Recombinant DNAs
DNA Cloning
18.13 Enzymatic Amplification of DNA by PCR
Applications of PCR
18.14 DNA Sequencing
18.15 DNA Libraries
Genomic Libraries
cDNA Libraries
18.16 DNA Transfer into Eukaryotic Cells and Mammalian Embryos
18.17 Determining Eukaryotic Gene Function by Gene Elimination or Silencing
In Vitro Mutagenesis
Knockout Mice
RNA Interference
18.18 The Use of Antibodies

Glossary
Additional Readings
Index

1 Introduction to the Study of Cell and Molecular Biology

1.1 The Discovery of Cells
1.2 Basic Properties of Cells
1.3 Two Fundamentally Different Classes of Cells
1.4 Viruses
THE HUMAN PERSPECTIVE: The Prospect of Cell Replacement Therapy
EXPERIMENTAL PATHWAYS: The Origin of Eukaryotic Cells

Cells, and the structures they comprise, are too small to be directly seen, heard, or touched. In spite of this tremendous handicap, cells are the subject of hundreds of thousands of publications each year, with virtually every aspect of their minuscule structure coming under scrutiny. In many ways, the study of cell and molecular biology stands as a tribute to human curiosity for seeking to discover, and to human creative intelligence for devising the complex instruments and elaborate techniques by which these discoveries can be made. This is not to imply that cell and molecular biologists have a monopoly on these noble traits. At one end of the scientific spectrum, astronomers are utilizing an orbiting telescope to capture images of primordial galaxies that are so far from earth they appear to us today as they existed more than 13 billion years ago, only a few hundred million years after the Big Bang. At the other end of the spectrum, nuclear physicists have recently forced protons to collide with one another at velocities approaching the speed of light, confirming the existence of a hypothesized particle, the Higgs boson, that is proposed to endow all other subatomic particles with mass. Clearly, our universe consists of worlds within worlds, all aspects of which make for fascinating study.

As will be apparent throughout this book, cell and molecular biology is reductionist; that is, it is based on the view that knowledge of the parts of the whole can explain the character of the whole. When viewed in this way, our feeling for the wonder and mystery of life may be replaced by the need to explain everything in terms of the workings of the “machinery” of the living system. To the degree to which this occurs, it is hoped that this loss can be replaced by an equally strong appreciation for the beauty and complexity of the mechanisms underlying cellular activity.

An example of the role of technological innovation in the field of cell biology. This light micrograph shows a cell lying on a microscopic bed of synthetic posts. The flexible posts serve as sensors to measure mechanical forces exerted by the cell. The red-stained elements are bundles of actin filaments within the cell that generate forces during cell locomotion. When the cell moves, it pulls on the attached posts, which report the amount of strain they are experiencing. The cell nucleus is stained green. (FROM J. L. TAN ET AL., PROC. NAT’L. ACAD. SCI. U.S.A., 100 (4), © 2003 NATIONAL ACADEMY OF SCIENCES, U.S.A. COURTESY OF CHRISTOPHER S. CHEN, THE JOHNS HOPKINS UNIVERSITY.)

1.1 | The Discovery of Cells

Because of their small size, cells can only be observed with the aid of a microscope, an instrument that provides a magnified image of a tiny object. We do not know when humans first discovered the remarkable ability of curved-glass surfaces to bend light and form images. Spectacles were first made in Europe in the thirteenth century, and the first compound (double-lens) light microscopes were constructed by the end of the sixteenth century. By the mid-1600s, a handful of pioneering scientists had used their handmade microscopes to uncover a world that would never have been revealed to the naked eye.

The discovery of cells (Figure 1.1a) is generally credited to Robert Hooke, an English microscopist who, at age 27, was awarded the position of curator of the Royal Society of London, England’s foremost scientific academy. One of the many questions Hooke attempted to answer was why stoppers made of cork (part of the bark of trees) were so well suited to holding air in a bottle. As he wrote in 1665: “I took a good clear piece of cork, and with a Pen-knife sharpen’d as keen as a Razor, I cut a piece of it off, and . . . then examining it with a Microscope, me thought I could perceive it to appear a little porous . . . much like a Honeycomb.” Hooke called the pores cells because they reminded him of the cells inhabited by monks living in a monastery. In actual fact, Hooke had observed the empty cell walls of dead plant tissue, walls that had originally been produced by the living cells they surrounded.

Meanwhile, Anton van Leeuwenhoek, a Dutchman who earned a living selling clothes and buttons, was spending his spare time grinding lenses and constructing simple microscopes of remarkable quality (Figure 1.1b). For 50 years, Leeuwenhoek sent letters to the Royal Society of London describing his microscopic observations, along with a rambling discourse on his daily habits and the state of his health. Leeuwenhoek was the first to examine a drop of pond water under the microscope and, to his amazement, observe the teeming microscopic “animalcules” that darted back and forth before his eyes. He was also the first to describe various forms of bacteria, which he obtained from water in which pepper had been soaked and from scrapings of his teeth. His initial letters to the Royal Society describing this previously unseen world were met with such skepticism that the society dispatched its curator, Robert Hooke, to confirm the observations. Hooke did just that, and Leeuwenhoek was soon a worldwide celebrity, receiving visits in Holland from Peter the Great of Russia and the queen of England.

It wasn’t until the 1830s that the widespread importance of cells was realized. In 1838, Matthias Schleiden, a German lawyer turned botanist, concluded that, despite differences in the structure of various tissues, plants were made of cells and that the plant embryo arose from a single cell. In 1839, Theodor Schwann, a German zoologist and colleague of Schleiden’s, published a comprehensive report on the cellular basis of animal life. Schwann concluded that the cells of plants and animals are similar structures and proposed these two tenets of the cell theory:

■ All organisms are composed of one or more cells.
■ The cell is the structural unit of life.

Schleiden and Schwann’s ideas on the origin of cells proved to be less insightful; both agreed that cells could arise from noncellular materials. Given the prominence that these two scientists held in the scientific world, it took a number of years before observations by other biologists were accepted as demonstrating that cells did not arise in this manner any more than organisms arose by spontaneous generation. By 1855, Rudolf Virchow, a German pathologist, had made a convincing case for the third tenet of the cell theory:

■ Cells can arise only by division from a preexisting cell.

Figure 1.1 The discovery of cells. (a) One of Robert Hooke’s more ornate compound (double-lens) microscopes. (Inset) Hooke’s drawing of a thin slice of cork, showing the honeycomb-like network of “cells.” (b) Single-lens microscope used by Anton van Leeuwenhoek to observe bacteria and other microorganisms. The biconvex lens, which was capable of magnifying an object approximately 270 times and providing a resolution of approximately 1.35 μm, was held between two metal plates. (A: THE GRANGER COLLECTION, NEW YORK; INSET BIOPHOTO ASSOCIATES/GETTY IMAGES, INC.; B: © BETTMANN/CORBIS)

1.2 | Basic Properties of Cells

Just as plants and animals are alive, so too are cells. Life, in fact, is the most basic property of cells, and cells are the smallest units to exhibit this property. Unlike the parts of a cell, which simply deteriorate if isolated, whole cells can be removed from a plant or animal and cultured in a laboratory where they will grow and reproduce for extended periods of time. If mistreated, they may die. Death can also be considered one of the most basic properties of life, because only a living entity faces this prospect. Remarkably, cells within the body generally die “by their own hand,” the victims of an internal program that causes cells that are no longer needed or cells that pose a risk of becoming cancerous to eliminate themselves.

The first culture of human cells was begun by George and Martha Gey of Johns Hopkins University in 1951. The cells were obtained from a malignant tumor and named HeLa cells after the donor, Henrietta Lacks. HeLa cells, descended by cell division from this first cell sample, are still being grown in laboratories around the world today (Figure 1.2). Because they are so much simpler to study than cells situated within the body, cells grown in vitro (i.e., in culture, outside the body) have become an essential tool of cell and molecular biologists. In fact, much of the information that will be discussed in this book has been obtained using cells grown in laboratory cultures. We will begin our exploration of cells by examining a few of their most fundamental properties.

Figure 1.2 HeLa cells, such as the ones pictured here, were the first human cells to be kept in culture for long periods of time and are still in use today. Unlike normal cells, which have a finite lifetime in culture, these cancerous HeLa cells can be cultured indefinitely as long as conditions are favorable to support cell growth and division. (TORSTEN WITTMANN/PHOTO RESEARCHERS, INC.)

Cells Are Highly Complex and Organized

Complexity is a property that is evident when encountered, but difficult to describe. For the present, we can think of complexity in terms of order and consistency. The more complex a structure, the greater the number of parts that must be in their proper place, the less tolerance of errors in the nature and interactions of the parts, and the more regulation or control that must be exerted to maintain the system. Cellular activities can be remarkably precise. DNA duplication, for example, occurs with an error rate of less than one mistake every ten million nucleotides incorporated, and most of these are quickly corrected by an elaborate repair mechanism that recognizes the defect.

During the course of this book, we will have occasion to consider the complexity of life at several different levels. We will discuss the organization of atoms into small-sized molecules; the organization of these molecules into giant polymers; and the organization of different types of polymeric molecules into complexes, which in turn are organized into subcellular organelles and finally into cells. As will be apparent, there is a great deal of consistency at every level. Each type of cell has a consistent appearance when viewed under a high-powered electron microscope; that is, its organelles have a particular shape and location, from one individual of a species to another. Similarly, each type of organelle has a consistent composition of macromolecules, which are arranged in a predictable pattern.

Consider the cells lining your intestine that are responsible for removing nutrients from your digestive tract (Figure 1.3). The epithelial cells that line the intestine are tightly connected to each other like bricks in a wall. The apical ends of these cells, which face the intestinal channel, have long processes (microvilli) that facilitate absorption of nutrients. The microvilli are able to project outward from the apical cell surface because they contain an internal skeleton made of filaments, which in turn are composed of protein (actin) monomers polymerized in a characteristic array. At their basal ends, intestinal cells have large numbers of mitochondria that provide the energy required to fuel various membrane transport processes. Each mitochondrion is composed of a defined pattern of internal membranes, which in turn are composed of a consistent array of proteins, including an electrically powered ATP-synthesizing machine that projects from the inner membrane like a ball on a stick. Each of these various levels of organization is illustrated in the insets of Figure 1.3.

Figure 1.3 Levels of cellular and molecular organization. The brightly colored photograph of a stained section shows the microscopic structure of a villus of the wall of the small intestine, as seen through the light microscope. Inset 1 shows an electron micrograph of the epithelial layer of cells that lines the inner intestinal wall. The apical surface of each cell, which faces the channel of the intestine, contains a large number of microvilli involved in nutrient absorption. The basal region of each cell contains large numbers of mitochondria, where energy is made available to the cell. Inset 2 shows the apical microvilli; each microvillus contains a bundle of microfilaments. Inset 3 shows the actin protein subunits that make up each microfilament. Inset 4 shows an individual mitochondrion similar to those found in the basal region of the epithelial cells. Inset 5 shows a portion of an inner membrane of a mitochondrion including the stalked particles (upper arrow) that project from the membrane and correspond to the sites where ATP is synthesized. Insets 6 and 7 show molecular models of the ATP-synthesizing machinery, which is discussed at length in Chapter 5. (LIGHT MICROGRAPH CECIL FOX/PHOTO RESEARCHERS; INSET 1 COURTESY OF SHAKTI P. KAPUR, GEORGETOWN UNIVERSITY MEDICAL CENTER; INSET 2 FROM MARK S. MOOSEKER AND LEWIS G. TILNEY, J. CELL BIOL. 67:729, 1975, REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS; INSET 3 COURTESY OF KENNETH C. HOLMES; INSET 4 KEITH R. PORTER/PHOTO RESEARCHERS; INSET 5 COURTESY OF HUMBERTO FERNANDEZ-MORAN; INSET 6 COURTESY OF RODERICK A. CAPALDI; INSET 7 COURTESY OF WOLFGANG JUNGE, HOLGER LILL, AND SIEGFRIED ENGELBRECHT, UNIVERSITY OF OSNABRÜCK, GERMANY.)

Fortunately for cell and molecular biologists, evolution has moved rather slowly at the levels of biological organization with which they are concerned. Whereas a human and a cat, for example, have very different anatomical features, the cells that make up their tissues, and the organelles that make up their cells, are very similar. The actin filament portrayed in Figure 1.3, Inset 3, and the ATP-synthesizing enzyme of Inset 6 are virtually identical to similar structures found in such diverse organisms as humans, snails, yeast, and redwood trees. Information obtained by studying cells from one type of organism often has direct application to other forms of life. Many of the most basic processes, such as the synthesis of proteins, the conservation of chemical energy, or the construction of a membrane, are remarkably similar in all living organisms.
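To put the DNA duplication error rate quoted in this section (less than one mistake every ten million nucleotides) in perspective, the short Python sketch below turns it into an upper bound on mistakes per copy of a genome. The genome size used here, roughly 3.2 billion nucleotides for one copy of the human genome, is an assumption brought in for illustration and is not given in the text; the function and constant names are likewise hypothetical.

```python
# Figure taken from the text: fewer than one mistake per ten million nucleotides.
NUCLEOTIDES_PER_MISTAKE = 10_000_000

# Assumption (not stated in the text): roughly 3.2 billion nucleotides
# in one copy of the human genome.
ASSUMED_GENOME_SIZE = 3_200_000_000


def max_expected_errors(genome_size, nucleotides_per_mistake=NUCLEOTIDES_PER_MISTAKE):
    """Return the upper bound on copying mistakes for one pass over the genome."""
    return genome_size / nucleotides_per_mistake


if __name__ == "__main__":
    bound = max_expected_errors(ASSUMED_GENOME_SIZE)
    print(f"Fewer than about {bound:.0f} mistakes per genome copy, before repair")
```

Under these assumptions the bound is a few hundred mistakes per copy of the genome, and, as the text notes, most of those are then corrected by the repair machinery.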

Cells Possess a Genetic Program and the Means to Use It

Organisms are built according to information encoded in a collection of genes, which are constructed of DNA. The human genetic program contains enough information, if converted to words, to fill millions of pages of text. Remarkably, this vast amount of information is packaged into a set of chromosomes that occupies the space of a cell nucleus, hundreds of times smaller than the dot on this i. Genes are more than storage lockers for information: they constitute the blueprints for constructing cellular structures, the directions for running cellular activities, and the program for making more of themselves. The molecular structure of genes allows for changes in genetic information (mutations) that lead to variation among individuals, which forms the basis of biological evolution. Discovering the mechanisms by which cells use and transmit their genetic information has been one of the greatest achievements of science in recent decades.

Cells Are Capable of Producing More of Themselves

Just as individual organisms are generated by reproduction, so too are individual cells. Cells reproduce by division, a process in which the contents of a “mother” cell are distributed into two “daughter” cells. Prior to division, the genetic material is faithfully duplicated, and each daughter cell receives a complete and equal share of genetic information. In most cases, the two daughter cells have approximately equal volume. In some cases, however, as occurs when a human oocyte undergoes division, one of the cells can retain nearly all of the cytoplasm, even though it receives only half of the genetic material (Figure 1.4).

Figure 1.4 Cell reproduction. This mammalian oocyte has recently undergone a highly unequal cell division in which most of the cytoplasm has been retained within the large oocyte, which has the potential to be fertilized and develop into an embryo. The other cell is a nonfunctional remnant that consists almost totally of nuclear material (indicated by the blue-staining chromosomes, arrow). (COURTESY OF JONATHAN VAN BLERKOM.)

Cells Acquire and Utilize Energy

Every biological process requires the input of energy. Virtually all of the energy utilized by life on the Earth’s surface arrives in the form of electromagnetic radiation from the sun. The energy of light is trapped by light-absorbing pigments present in the membranes of photosynthetic cells (Figure 1.5). Light energy is converted by photosynthesis into chemical energy that is stored in energy-rich carbohydrates, such as sucrose or starch. For most animal cells, energy arrives prepackaged, often in the form of the sugar glucose. In humans, glucose is released by the liver into the blood where it circulates through the body delivering chemical energy to all the cells. Once in a cell, the glucose is disassembled in such a way that its energy content can be stored in a readily available form (usually as ATP) that is later put to use in running all of the cell’s myriad energy-requiring activities. Cells expend an enormous amount of energy simply breaking down and rebuilding the macromolecules and organelles of which they are made. This continual “turnover,” as it is called, maintains the integrity of cell components in the face of inevitable wear and tear and enables the cell to respond rapidly to changing conditions.

Figure 1.5 Acquiring energy. A living cell of the filamentous alga Spirogyra. The ribbon-like chloroplast, which is seen to zigzag through the cell, is the site where energy from sunlight is captured and converted to chemical energy during photosynthesis. (M. I. WALKER/PHOTO RESEARCHERS, INC.)

Cells Carry Out a Variety of Chemical Reactions

Cells function like miniaturized chemical plants. Even the simplest bacterial cell is capable of hundreds of different chemical transformations, none of which occurs at any significant rate in the inanimate world. Virtually all chemical changes that take place in cells require enzymes, molecules that greatly increase the rate at which a chemical reaction occurs. The sum total of the chemical reactions in a cell represents that cell’s metabolism.

Cells Engage in Mechanical Activities

Cells are sites of bustling activity. Materials are transported from place to place, structures are assembled and then rapidly disassembled, and, in many cases, the entire cell moves itself from one site to another. These types of activities are based on dynamic, mechanical changes within cells, many of which are initiated by changes in the shape of “motor” proteins. Motor proteins are just one of many types of molecular “machines” employed by cells to carry out mechanical activities.

Cells Are Able to Respond to Stimuli

Some cells respond to stimuli in obvious ways; a single-celled protist, for example, moves away from an object in its path or moves toward a source of nutrients. Cells within a multicellular plant or animal respond to stimuli less obviously. Most cells are covered with receptors that interact with substances in the environment in highly specific ways. Cells possess receptors to hormones, growth factors, and extracellular materials, as well as to substances on the surfaces of other cells. A cell’s receptors provide pathways through which external stimuli can evoke specific responses in target cells. Cells may respond to specific stimuli by altering their metabolic activities, moving from one place to another, or even committing suicide.

Cells Are Capable of Self-Regulation In recent years, a new term has been used to describe cells: robustness. Cells are robust, that is, hearty or durable, because they are protected from dangerous fluctuations in composition and behavior. Should such fluctuations occur, specific feedback circuits are activated that serve to return the cell to the appropriate state. In addition to requiring energy, maintaining a complex, ordered state requires constant regulation. The importance of a cell’s regulatory mechanisms becomes most evident when they break down. For example, failure of a cell to correct a mistake when it duplicates its DNA may result in a debilitating mutation, or a breakdown in a cell’s growth-control safeguards can transform the cell into a cancer cell with the capability of destroying the entire organism. We are gradually learning how a cell controls its activities, but much more is left to discover. Consider the following experiment conducted in 1891 by Hans Driesch, a German embryologist. Driesch found that he could completely separate the first two or four cells of a sea urchin embryo and each of the isolated cells would proceed to develop into a normal embryo (Figure 1.6). How can a cell that is normally destined to form only part of an embryo regulate its own activities and form an entire embryo? How does the isolated cell recognize the absence of its neighbors, and how does this recognition redirect the entire course of the cell’s development? How can a part of an embryo have a sense of the whole? We are not able to answer these questions much better today than we were more than a hundred years ago when the experiment was performed.

Figure 1.6 Self-regulation. The left panel depicts the normal development of a sea urchin in which a fertilized egg gives rise to a single embryo. The right panel depicts an experiment in which the cells of an early embryo are separated from one another after the first division, and each cell is allowed to develop in isolation. Rather than developing into half of an embryo, as it would if left undisturbed, each isolated cell recognizes the absence of its neighbor, regulating its development to form a complete (although smaller) embryo.

Throughout this book we will be discussing processes that require a series of ordered steps, much like the assembly-line construction of an automobile in which workers add, remove, or make specific adjustments as the car moves along. In the cell, the information for product design resides in the nucleic acids, and the construction workers are primarily proteins. It is the presence of these two types of macromolecules that, more than any other factor, sets the chemistry of the cell apart from that of the nonliving world. In the cell, the workers must act without the benefit of conscious direction. Each step of a process must occur spontaneously in such a way that the next step is automatically triggered. In many ways, cells operate in a manner analogous to the orange-squeezing contraption discovered by “The Professor” and shown in Figure 1.7. Each type of cellular activity requires a unique set of highly complex molecular tools and machines, the products of eons of natural selection and biological evolution.


Figure 1.7 Cellular activities are often analogous to this “Rube Goldberg machine” in which one event “automatically” triggers the next event in a reaction sequence. (RUBE GOLDBERG IS THE ® AND © OF RUBE GOLDBERG, INC.)

A primary goal of biologists is to understand the molecular structure and role of each component involved in a particular activity, the means by which these components interact, and the mechanisms by which these interactions are regulated.

Cells Evolve

How did cells arise? Of all the major questions posed by biologists, this question may be the least likely ever to be answered. It is presumed that cells evolved from some type of precellular life form, which in turn evolved from nonliving organic materials that were present in the primordial seas. Whereas the origin of cells is shrouded in near-total mystery, the evolution of cells can be studied by examining organisms that are alive today. If you were to observe the features of a bacterial cell living in the human intestinal tract (see Figure 1.18a) and a cell that is part of the lining of that tract (Figure 1.3), you would be struck by the differences between the two cells. Yet both of these cells, as well as all other cells that are present in living organisms, share many features, including a common genetic code, a plasma membrane, and ribosomes. According to one of the tenets of modern biology, all living organisms have evolved from a single, common ancestral cell that lived more than three billion years ago. Because it gave rise to all the living organisms that we know of, this ancient cell is often referred to as the last universal common ancestor (or LUCA). We will examine some of the events that occurred during the evolution of cells in the Experimental Pathways at the end of the chapter. Keep in mind that evolution is not simply an event of the past, but an ongoing process that continues to modify the properties of cells that will be present in organisms that have yet to appear.

REVIEW

1. List the fundamental properties shared by all cells. Describe the importance of each of these properties.
2. Describe the features of cells that suggest that all living organisms are derived from a common ancestor.
3. What is the source of energy that supports life on Earth? How is this energy passed from one organism to the next?

1.3 | Two Fundamentally Different Classes of Cells Once the electron microscope became widely available, biologists were able to examine the internal structure of a wide variety of cells. It became apparent from these studies that there were two basic classes of cells, prokaryotic and eukaryotic, distinguished by their size and the types of internal structures, or organelles, they contain (Figure 1.8). The existence of two distinct classes of cells, without any known intermediates, represents one of the most fundamental evolutionary divisions in the biological world. The structurally simpler prokaryotic cells include bacteria, whereas the structurally more complex eukaryotic cells include protists, fungi, plants, and animals.1 We are not sure when prokaryotic cells first appeared on Earth. Evidence of prokaryotic life has been obtained from rocks approximately 2.7 billion years of age. Not only do these

1 Those interested in examining a proposal to do away with the concept of prokaryotic versus eukaryotic organisms can read a brief essay by N. R. Pace in Nature 441:289, 2006.

Figure 1.8 The structure of cells. Schematic diagrams of a “generalized” bacterial (a), plant (b), and animal (c) cell. Note: Organelles are not drawn to scale. (FROM D. J. DES MARAIS, SCIENCE 289:1704, 2001. COPYRIGHT © 2000. REPRINTED WITH PERMISSION FROM AAAS.)

rocks contain what appear to be fossilized microbes, they contain complex organic molecules that are characteristic of particular types of prokaryotic organisms, including cyanobacteria. It is unlikely that such molecules could have been synthesized abiotically, that is, without the involvement of living cells. Cyanobacteria almost certainly appeared by 2.4 billion years ago, because that is when the atmosphere became infused with molecular oxygen (O2), which is a byproduct of the photosynthetic activity of these prokaryotes. The dawn of the age of eukaryotic cells is also shrouded in uncertainty. Complex multicellular animals appear rather suddenly in the fossil record approximately 600 million years ago, but there is considerable evidence that simpler eukaryotic organisms were present on Earth more than one billion years earlier. The estimated time of appearance on Earth of several major groups of organisms is depicted in Figure 1.9. Even a superficial examination of Figure 1.9 reveals how “quickly” life arose following the formation of Earth and cooling of its surface, and how long it took for the subsequent evolution of complex animals and plants.

Characteristics That Distinguish Prokaryotic and Eukaryotic Cells The following brief comparison between prokaryotic and eukaryotic cells reveals many basic differences between the two types, as well as many similarities (see Figure 1.8). The similarities and differences between the two types of cells are listed in Table 1.1. The shared properties reflect the fact that eukaryotic cells almost certainly evolved from prokaryotic ancestors. Because of their common ancestry, both types of cells share an identical genetic language, a common set of metabolic pathways, and many common structural features. For example, both types of cells are bounded by plasma membranes of similar construction that serve as a selectively permeable barrier between the living and nonliving worlds. Both types of cells may be surrounded by a rigid, nonliving cell wall that protects the delicate life form within. Although the cell walls of prokaryotes and eukaryotes may have similar functions, their chemical composition is very different.

Figure 1.8 (continued)

Figure 1.9 Earth’s biogeologic clock. A portrait of the past five billion years of Earth’s history showing a proposed time of appearance of major groups of organisms. Complex animals (shelly invertebrates) and vascular plants are relatively recent arrivals. The time indicated for the origin of life is speculative. In addition, photosynthetic bacteria may have arisen much earlier, hence the question mark. The geologic eras are indicated in the center of the illustration. (FROM D. J. DES MARAIS, SCIENCE 289:1704, 2001. COPYRIGHT © 2000. REPRINTED WITH PERMISSION FROM AAAS.)

Internally, eukaryotic cells are much more complex, both structurally and functionally, than prokaryotic cells (Figure 1.8). The difference in structural complexity is evident in

Table 1.1 A Comparison of Prokaryotic and Eukaryotic Cells

Features held in common by the two types of cells:
■ Plasma membrane of similar construction
■ Genetic information encoded in DNA using identical genetic code
■ Similar mechanisms for transcription and translation of genetic information, including similar ribosomes
■ Shared metabolic pathways (e.g., glycolysis and TCA cycle)
■ Similar apparatus for conservation of chemical energy as ATP (located in the plasma membrane of prokaryotes and the mitochondrial membrane of eukaryotes)
■ Similar mechanism of photosynthesis (between cyanobacteria and green plants)
■ Similar mechanism for synthesizing and inserting membrane proteins
■ Proteasomes (protein digesting structures) of similar construction (between archaebacteria and eukaryotes)

Features of eukaryotic cells not found in prokaryotes:
■ Division of cells into nucleus and cytoplasm, separated by a nuclear envelope containing complex pore structures
■ Complex chromosomes composed of DNA and associated proteins that are capable of compacting into mitotic structures
■ Complex membranous cytoplasmic organelles (includes endoplasmic reticulum, Golgi complex, lysosomes, endosomes, peroxisomes, and glyoxysomes)
■ Specialized cytoplasmic organelles for aerobic respiration (mitochondria) and photosynthesis (chloroplasts)
■ Complex cytoskeletal system (including microfilaments, intermediate filaments, and microtubules) and associated motor proteins
■ Complex flagella and cilia
■ Ability to ingest particulate material by enclosure within plasma membrane vesicles (phagocytosis)
■ Cellulose-containing cell walls (in plants)
■ Cell division using a microtubule-containing mitotic spindle that separates chromosomes
■ Presence of two copies of genes per cell (diploidy), one from each parent
■ Presence of three different RNA synthesizing enzymes (RNA polymerases)
■ Sexual reproduction requiring meiosis and fertilization

the electron micrographs of a bacterial and an animal cell shown in Figures 1.18a and 1.10, respectively. Both contain a nuclear region, which houses the cell’s genetic material, surrounded by cytoplasm. The genetic material of a prokaryotic cell is present in a nucleoid: a poorly demarcated region of the cell that lacks a boundary membrane to separate it from the surrounding cytoplasm. In contrast, eukaryotic cells possess a nucleus: a region bounded by a complex membranous structure called the nuclear envelope. This difference in nuclear structure is the basis for the terms prokaryotic (pro = before, karyon = nucleus) and eukaryotic (eu = true, karyon = nucleus).

Prokaryotic cells contain relatively small amounts of DNA; the DNA content of bacteria ranges from about 600,000 base pairs to nearly 8 million base pairs and encodes between about 500 and several thousand proteins.2 Although a “simple” baker’s yeast cell has only slightly more DNA (12 million base pairs encoding about 6200 proteins) than the most complex prokaryotes, most eukaryotic cells contain considerably more genetic information.

Both prokaryotic and eukaryotic cells have DNA-containing chromosomes. Eukaryotic cells possess a number of separate chromosomes, each containing a single linear molecule of DNA. In contrast, nearly all prokaryotes that have been studied contain a single, circular chromosome. More importantly, the chromosomal DNA of eukaryotes, unlike that of prokaryotes, is tightly associated with proteins to form a complex nucleoprotein material known as chromatin.

The cytoplasm of the two types of cells is also very different. The cytoplasm of a eukaryotic cell is filled with a great diversity of structures, as is readily apparent by examining an electron micrograph of nearly any plant or animal cell (Figure 1.10). Even yeast, the simplest eukaryote, is much more complex structurally than an average bacterium (compare Figures 1.18a and b), even though these two organisms have a similar number of genes. Eukaryotic cells contain an array of membrane-bound organelles. Eukaryotic organelles include mitochondria, where chemical energy is made available to fuel cellular activities; an endoplasmic reticulum, where many of a cell’s proteins and lipids are manufactured; Golgi complexes, where materials are sorted, modified, and transported to specific cellular destinations; and a variety of simple membrane-bound vesicles of varying dimension. Plant cells contain additional membranous organelles, including chloroplasts, which are the sites of photosynthesis, and often a single large vacuole that can occupy most of the volume of the cell. Taken as a group, the membranes of the eukaryotic cell serve to divide the cytoplasm into compartments within which specialized activities can take place. In contrast, the cytoplasm of prokaryotic cells is essentially devoid of membranous structures. The complex photosynthetic membranes of the cyanobacteria are a major exception to this generalization (see Figure 1.15).

The cytoplasmic membranes of eukaryotic cells form a system of interconnecting channels and vesicles that function in the transport of substances from one part of a cell to another, as well as between the inside of the cell and its environment. Because prokaryotic cells are so small, directed intracytoplasmic communication is less important in them, and the necessary movement of materials can be accomplished by simple diffusion.

Eukaryotic cells also contain numerous structures lacking a surrounding membrane. Included in this group are the elongated tubules and filaments of the cytoskeleton, which participate in cell contractility, movement, and support. It was thought for many years that prokaryotic cells lacked any trace of a cytoskeleton, but primitive cytoskeletal filaments have been found in bacteria. It is still fair to say that the prokaryotic cytoskeleton is much simpler, both structurally and functionally, than that of eukaryotes. Both eukaryotic and prokaryotic cells possess ribosomes, which are nonmembranous particles that function as “workbenches” on which the proteins of the cell are manufactured. Even though ribosomes of prokaryotic and eukaryotic cells have considerably different dimensions (those of prokaryotes are smaller and contain fewer components), these structures participate in the assembly of proteins by a similar mechanism in both types of cells. Figure 1.11 is a colorized electron micrograph of a portion of the cytoplasm near the thin edge of a single-celled eukaryotic organism. This is a region of the cell where membrane-bound organelles tend to be absent. The micrograph shows individual filaments of the cytoskeleton (red) and other large macromolecular complexes of the cytoplasm (green). Most of these complexes are ribosomes. It is evident from this type of image that the cytoplasm of a eukaryotic cell is extremely crowded, leaving very little space for the soluble phase of the cytoplasm, which is called the cytosol.

Other major differences between eukaryotic and prokaryotic cells can be noted. Eukaryotic cells divide by a complex process of mitosis in which duplicated chromosomes condense into compact structures that are segregated by an elaborate microtubule-containing apparatus (Figure 1.12). This apparatus, which is called a mitotic spindle, allows each daughter cell to receive an equivalent array of genetic material. In prokaryotes, there is no compaction of the chromosome and no mitotic spindle. The DNA is duplicated, and the two copies are separated accurately by the growth of an intervening cell membrane.

Figure 1.10 The structure of a eukaryotic cell. This epithelial cell lines the male reproductive tract in the rat. A number of different organelles are indicated and depicted in schematic diagrams around the border of the figure. (DAVID M. PHILLIPS/PHOTO RESEARCHERS, INC.)

Figure 1.11 The cytoplasm of a eukaryotic cell is a crowded compartment. This colorized electron micrographic image shows a small region near the edge of a single-celled eukaryotic organism that had been quickly frozen prior to microscopic examination. The three-dimensional appearance is made possible by capturing two-dimensional digital images of the specimen at different angles and merging the individual frames using a computer. Cytoskeletal filaments are shown in red, macromolecular complexes (primarily ribosomes) are green, and portions of cell membranes are blue. (FROM OHAD MEDALIA ET AL., SCIENCE 298:1211, 2002, FIGURE 3A. © 2002, REPRINTED WITH PERMISSION FROM AAAS. PHOTO PROVIDED COURTESY OF WOLFGANG BAUMEISTER.)

Figure 1.12 Cell division in eukaryotes requires the assembly of an elaborate chromosome-separating apparatus called the mitotic spindle, which is constructed primarily of microtubules. The microtubules in this micrograph appear green because they are bound by an antibody that is linked to a green fluorescent dye. The chromosomes, which were about to be separated into two daughter cells when this cell was fixed, are stained blue. (COURTESY OF CONLY L. RIEDER.)

2 Eight million base pairs is equivalent to a DNA molecule nearly 3 mm long.
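The footnote's figure of nearly 3 mm for eight million base pairs can be checked with a short calculation. The sketch below assumes the commonly cited value of about 0.34 nm of helix length per base pair for B-form DNA, which is not stated in this section; the function name `dna_length_mm` and the dictionary of example genomes are illustrative only.

```python
# Assumed rise per base pair of B-form DNA, in nanometers.
# This is a standard textbook value, not a number given in this section.
NM_PER_BASE_PAIR = 0.34


def dna_length_mm(base_pairs: int) -> float:
    """Return the end-to-end length, in millimeters, of fully extended DNA."""
    length_nm = base_pairs * NM_PER_BASE_PAIR
    return length_nm / 1_000_000  # 1 mm = 1,000,000 nm


# Genome sizes quoted in the text (approximate).
genomes = {
    "smallest bacterial genomes": 600_000,
    "largest bacterial genomes": 8_000_000,
    "baker's yeast (all chromosomes combined)": 12_000_000,
}

for label, base_pairs in genomes.items():
    print(f"{label}: {dna_length_mm(base_pairs):.2f} mm")
```

Under that assumption, 8 million base pairs works out to about 2.72 mm, consistent with the footnote. The 12 million base pairs of yeast come to about 4 mm in total, but that length is divided among several separate linear chromosomes rather than contained in one molecule.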

For the most part, prokaryotes are nonsexual organisms. They contain only one copy of their single chromosome and have no processes comparable to meiosis, gamete formation, or true fertilization. Even though true sexual reproduction is lacking among prokaryotes, some are capable of conjugation, in which a piece of DNA is passed from one cell to another (Figure 1.13). However, the recipient almost never receives a whole chromosome from the donor, and the condition in which the recipient cell contains both its own and its partner’s DNA is fleeting. The cell soon reverts back to possession of a single chromosome. Although prokaryotes may not be as efficient as eukaryotes in exchanging DNA with other members of their own species, they are more adept than eukaryotes at picking up and incorporating foreign DNA from their environment, which has had considerable impact on microbial evolution (page 29). Eukaryotic cells possess a variety of complex locomotor mechanisms, whereas those of prokaryotes are relatively simple. The movement of a prokaryotic cell may be accomplished by a thin protein filament, called a flagellum, which protrudes from the cell and rotates (Figure 1.14a). The rotations of the flagellum, which can exceed 1000 times per second, exert pressure against the surrounding fluid, propelling the cell through the medium. Certain eukaryotic cells, including many protists and sperm cells, also possess flagella, but the eukaryotic versions are much more complex than the simple protein filaments of bacteria (Figure 1.14b), and they generate movement by a different mechanism. In the preceding paragraphs, many of the most important differences between the prokaryotic and eukaryotic levels of cellular organization were mentioned. We will elaborate on many of these points in later chapters. 
Before you dismiss prokaryotes as inferior, keep in mind that these organisms have remained on Earth for more than three billion years, and at this very moment, trillions of them are clinging to the outer surface of your body and feasting on the nutrients within your digestive tract. We think of these organisms as individual, solitary creatures, but recent insights have shown that they live in complex, multispecies communities called biofilms. The layer of plaque that grows on our teeth is an example of a biofilm. Different cells in a biofilm may carry out different specialized activities, not unlike the cells in a plant or an animal. Consider also that, metabolically, prokaryotes are very sophisticated, highly evolved organisms. For example, a bacterium, such as Escherichia coli, a common inhabitant of both the human digestive tract and the laboratory culture dish, has the ability to live and prosper in a medium containing one or two low-molecular-weight organic compounds and a few inorganic ions. Other bacteria are able to live on a diet consisting solely of inorganic substances. One species of bacteria has been found in wells more than a thousand meters below the Earth’s surface living on basalt rock and molecular hydrogen (H2) produced by inorganic reactions. In contrast, even the most metabolically talented cells in your body require a variety of organic compounds, including a number of vitamins and other essential substances they cannot make on their own. In fact, many of these essential dietary ingredients are produced by the bacteria that normally live in the large intestine.

Figure 1.13 Bacterial conjugation. Electron micrograph showing a conjugating pair of bacteria joined by a structure of the donor cell, termed the F pilus, through which DNA is thought to be passed. (COURTESY OF CHARLES C. BRINTON, JR., AND JUDITH CARNAHAN.)

Figure 1.14 The difference between prokaryotic and eukaryotic flagella. (a) The bacterium Salmonella with its numerous flagella. Inset shows a high-magnification view of a portion of a single bacterial flagellum, which consists largely of a single protein called flagellin. (b) Each of these human sperm cells is powered by the undulatory movements of a single flagellum. The inset shows a cross section of the central core of a mammalian sperm flagellum. The flagella of eukaryotic cells are so similar that this cross section could just as well have been taken of a flagellum from a protist or green alga. (A: FROM BERNARD R. GERBER, LEWIS M. ROUTLEDGE, AND SHIRO TAKASHIMA, J. MOL. BIOL. 71:322, © 1972, WITH PERMISSION FROM ELSEVIER. INSET COURTESY OF JULIUS ADLER AND M. L. DEPAMPHILIS; B: JUERGEN BERGER/PHOTO RESEARCHERS, INC.; INSET: DON W. FAWCETT/PHOTO RESEARCHERS, INC.)


Types of Prokaryotic Cells The distinction between prokaryotic and eukaryotic cells is based on structural complexity (as detailed in Table 1.1) and not on phylogenetic relationship. Prokaryotes are divided into two major taxonomic groups, or domains: the Archaea (or archaebacteria) and the Bacteria (or eubacteria). Members of the Archaea are more closely related to eukaryotes than they are to the other group of prokaryotes (the Bacteria). The experiments that led to the discovery that life is represented by three distinct branches are discussed in the Experimental Pathways at the end of the chapter. The domain Archaea includes several groups of organisms whose evolutionary ties to one another are revealed by similarities in the nucleotide sequences of their nucleic acids. The best known Archaea are species that live in extremely inhospitable environments; they are often referred to as “extremophiles.” Included among the Archaea are the methanogens [prokaryotes capable of converting CO2 and H2 gases into methane (CH4) gas]; the halophiles (prokaryotes that live in extremely salty environments, such as the Dead Sea or certain deep sea brine pools that possess a salinity equivalent to 5 M MgCl2); acidophiles (acid-loving prokaryotes that thrive at a pH as low as 0, such as that found in the drainage fluids of abandoned mine shafts); and thermophiles (prokaryotes that live at very high temperatures). Included in this last-named group are hyperthermophiles, which live in the hydrothermal vents of the ocean floor. The latest record holder among this group has been named “strain 121” because it is able to grow and divide in superheated water at a temperature of 121°C, which just happens to be the temperature used to sterilize surgical instruments in an autoclave. Recent analyses of soil and ocean microbes indicate that many members of the Archaea are also at home in habitats of normal temperature, pH, and salinity.
All other prokaryotes are classified in the domain Bacteria. This domain includes the smallest known cells, the mycoplasma (0.2 μm diameter), which are the only known prokaryotes to lack a cell wall and to contain a genome with fewer than 500 genes. Bacteria are present in every conceivable habitat on Earth, from the permanent ice shelf of the Antarctic to the driest African deserts, to the internal confines of plants and animals. Bacteria have even been found living in rock layers situated several kilometers beneath the Earth’s surface. Some of these bacterial communities are thought to have been cut off from life on the surface for more than one hundred million years. The most complex prokaryotes are the cyanobacteria. Cyanobacteria contain elaborate arrays of cytoplasmic membranes, which serve as sites of photosynthesis (Figure 1.15a). The membranes of cyanobacteria are very similar to the photosynthetic membranes present within the chloroplasts of plant cells. As in eukaryotic plants, photosynthesis in cyanobacteria is accomplished by splitting water molecules, which releases molecular oxygen. Many cyanobacteria are capable not only of photosynthesis, but also of nitrogen fixation, the conversion of nitrogen (N2) gas into reduced forms of nitrogen (such as ammonia, NH3) that can be used by cells in the synthesis of nitrogen-containing organic compounds, including amino acids and nucleotides. Those species capable of both photosynthesis and nitrogen fixation can survive on the barest of resources: light, N2, CO2, and H2O. It is not surprising, therefore, that cyanobacteria are usually the first organisms to colonize the bare rocks rendered lifeless by a scorching volcanic eruption. Another unusual habitat occupied by cyanobacteria is illustrated in Figure 1.15b.

Figure 1.15 Cyanobacteria. (a) Electron micrograph of a cyanobacterium showing the cytoplasmic membranes that carry out photosynthesis. These concentric membranes are very similar to the thylakoid membranes present within the chloroplasts of plant cells, a reminder that chloroplasts evolved from a symbiotic cyanobacterium. (b) Cyanobacteria living inside the hairs of these polar bears are responsible for the unusual greenish color of their coats. (A: COURTESY OF NORMA J. LANG; B: COURTESY ZOOLOGICAL SOCIETY OF SAN DIEGO.)

Prokaryotic Diversity For the most part, microbiologists are familiar only with those microorganisms they are able to grow in a culture medium. When a patient suffering from a respiratory or urinary tract infection sees his or her physician, one of the first steps often taken is to culture the pathogen. Once it has been cultured, the organism can be identified and the proper treatment prescribed. It has proven relatively easy to culture most disease-causing prokaryotes, but the same is not true for those living free in nature. The problem is compounded by the fact that prokaryotes are barely visible in a light microscope and their morphology is often not very distinctive. To date, roughly 6000 species of prokaryotes have

15

encoded by these microbial genomes are the synthesis of vitamins, the breakdown of complex plant sugars, and the prevention of growth of pathogenic organisms. By using sequence-based molecular techniques, biologists have found that most habitats on Earth are teeming with previously unrecognized prokaryotic life. One estimate of the sheer numbers of prokaryotes in the major habitats of the Earth is given in Table 1.2. It is noteworthy that more than 90 percent of these organisms are now thought to live in the subsurface sediments well beneath the oceans and upper soil layers. Nutrients can be so scarce in some of these deep sediments that microbes living there are thought to divide only once every several hundred years! Table 1.2 also provides an estimate of the amount of carbon that is sequestered in the world’s prokaryotic cells. To put this number into more familiar terms, it is roughly comparable to the total amount of carbon present in all of the world’s plant life.

Types of Eukaryotic Cells: Cell Specialization In many regards, the most complex eukaryotic cells are not found inside of plants or animals, but rather among the singlecelled (unicellular) protists, such as those pictured in Figure 1.16. All of the machinery required for the complex activities

Table 1.2 Number and Biomass of Prokaryotes in the World Environment

Aquatic habitats Oceanic subsurface Soil Terrestrial subsurface Total

No. of prokaryotic cells, ⴛ 1028

Pg of C in prokaryotes*

12 355 26 25–250 415–640

2.2 303 26 22–215 353–546

*1 Petagram (Pg) ⫽ 1015 g. Source: W. B. Whitman et al., Proc. Nat’l. Acad. Sci. U.S.A. 95:6581, 1998.

Figure 1.16 Vorticella, a complex ciliated protist. A number of these unicellular organisms are seen here; most have withdrawn their “heads” due to shortening of the blue-stained contractile ribbon in the stalk. Each cell has a single large nucleus, called a macronucleus (arrow), which contains many copies of the genes. (CAROLINA BIOLOGICAL SUPPLY CO./PHOTOTAKE.)

1.3 Two Fundamentally Different Classes of Cells

been identified by traditional techniques, which is less than one-tenth of 1 percent of the millions of prokaryotic species thought to exist on Earth! Our appreciation for the diversity of prokaryotic communities has increased dramatically in recent years with the use of molecular techniques that do not require the isolation of a particular organism. Suppose one wanted to learn about the diversity of prokaryotes that live in the upper layers of the Pacific Ocean off the coast of California. Rather than trying to culture such organisms, which would prove largely futile, a researcher could concentrate the cells from a sample of ocean water, extract the DNA, and analyze certain DNA sequences present in the preparation. All organisms share certain genes, such as those that code for the RNAs present in ribosomes or the enzymes of certain metabolic pathways. Even though all organisms may share such genes, the sequences of the nucleotides that make up the genes vary considerably from one species to another. This is the basis of biological evolution. By using techniques that reveal the variety of DNA sequences of a particular gene in a particular habitat, one learns directly about the diversity of species that live in that habitat. Recent sequencing techniques have become so rapid and cost-efficient that virtually all of the genes present in the microbes of a given habitat can be sequenced, generating a collective genome, or metagenome. This approach can provide information about the types of proteins these organisms manufacture and thus about many of the metabolic activities in which they engage. These same molecular strategies are being used to explore the remarkable diversity among the trillions of “unseen passengers” that live on or within our own bodies, in habitats such as the intestinal tract, mouth, vagina, and skin. 
This collection of microbes, which is known as the human microbiome, is the subject of several international research efforts aimed at identifying and characterizing these organisms in people of different age, diet, geography, and state of health. It has already been demonstrated, for example, that obese and lean humans have markedly different populations of bacteria in their digestive tracts. As obese individuals lose weight, their bacterial profile shifts toward that of the leaner individuals. One recent study of fecal samples taken from 124 people of varying weight revealed the presence within the collective population of more than 1000 different species of bacteria. Taken together, these microbes contained more than 3 million distinct genes—approximately 150 times as many as the number present in the human genome. Among the functions of proteins


in which this organism engages—sensing the environment, trapping food, expelling excess fluid, evading predators—is housed within the confines of a single cell. Complex unicellular organisms represent one evolutionary pathway. An alternate pathway has led to the evolution of multicellular organisms in which different activities are conducted by different types of specialized cells. Specialized cells are formed by a process called differentiation. A fertilized human egg, for example, will progress through a course of embryonic development that leads to the formation of approximately 250 distinct types of differentiated cells. Some cells become part of a particular digestive gland, others part of a large skeletal muscle, others part of a bone, and so forth (Figure 1.17). The pathway of differentiation followed by each embryonic cell depends primarily on the signals it receives from the surrounding environment; these signals in turn depend on the position of that cell within the embryo. As discussed in the accompanying Human Perspective, researchers

are learning how to control the process of differentiation in the culture dish and applying this knowledge to the treatment of complex human diseases. As a result of differentiation, different types of cells acquire a distinctive appearance and contain unique materials. Skeletal muscle cells contain a network of precisely aligned filaments composed of unique contractile proteins; cartilage cells become surrounded by a characteristic matrix containing polysaccharides and the protein collagen, which together provide mechanical support; red blood cells become disk-shaped sacks filled with a single protein, hemoglobin, which transports oxygen; and so forth. Despite their many differences, the various cells of a multicellular plant or animal are composed of similar organelles. Mitochondria, for example, are found in essentially all types of cells. In one type, however, they may have a rounded shape, whereas in another they may be highly elongated and thread-like. In each case, the number, appearance, and location of the various organelles can be

Chapter 1 Introduction to the Study of Cell and Molecular Biology

Figure 1.17 Pathways of cell differentiation. A few of the types of differentiated cells present in a human fetus. Panels: bundle of nerve cells; loose connective tissue with fibroblasts; red blood cells; smooth muscle cells; bone tissue with osteocytes; fat (adipose) cells; striated muscle cells; intestinal epithelial cells. (MICROGRAPHS COURTESY OF MICHAEL ROSS, UNIVERSITY OF FLORIDA.)


correlated with the activities of the particular cell type. An analogy might be made to a variety of orchestral pieces: all are composed of the same notes, but varying arrangement gives each its unique character and beauty.

Model Organisms

Living organisms are highly diverse, and the results obtained from a particular experimental analysis may depend on the particular organism being studied. As a result, cell and molecular biologists have focused considerable research activities on a small number of “representative” or model organisms. It is hoped that a comprehensive body of knowledge built on these studies will provide a framework to understand those basic processes that are shared by most organisms, especially humans. This is not to suggest that many other organisms are not widely used in the study of cell and molecular biology. Nevertheless, six model organisms (one prokaryote and five eukaryotes) have captured much of the attention: a bacterium, E. coli; a budding yeast, Saccharomyces cerevisiae; a flowering plant, Arabidopsis thaliana; a nematode, Caenorhabditis elegans; a fruit fly, Drosophila melanogaster; and a mouse, Mus musculus. Each of these organisms has specific advantages that make it particularly useful as a research subject for answering certain types of questions. Each of these organisms is pictured in Figure 1.18, and a few of their advantages as research systems are described in the accompanying legend. We will concentrate in this text on results obtained from studies on mammalian systems (mostly on the mouse and on cultured mammalian cells) because these findings are most applicable to humans. Even so, we will have many occasions to describe research carried out on the cells of other species. You may be surprised to discover how similar you are at the cell and molecular level to these much smaller and simpler organisms.

length. Prokaryotic cells typically range in length from about 1 to 5 µm, eukaryotic cells from about 10 to 30 µm. There are a number of reasons most cells are so small. Consider the following.

The Sizes of Cells and Their Components

Synthetic Biology

A goal of one field of biological research, often referred to as synthetic biology, is to create some minimal type of living cell in the laboratory, essentially from “scratch,” as suggested by the cartoon in Figure 1.20. One motivation of these researchers is simply to accomplish the feat and, in the process, demonstrate that life at the cellular level emerges spontaneously when the proper constituents are brought together from chemically synthesized materials. At this point in time, biologists are nowhere near accomplishing this feat, and many members of society would argue that it would be unethical to do so. A more modest goal of synthetic biology is to develop novel life forms, using existing organisms as a starting point, that have a unique value in medicine and industry, or in cleaning up the environment.

³ You can verify this statement by calculating the surface area and volume of a cube whose sides are 1 cm in length versus a cube whose sides are 10 cm in length. The surface area/volume ratio of the smaller cube is considerably greater than that of the larger cube.
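The cube comparison suggested in this footnote can be worked through in a few lines of Python. This is only an illustrative sketch; the function name `surface_to_volume_ratio` is our own and does not come from the text.

```python
def surface_to_volume_ratio(side_cm: float) -> float:
    """Return surface area / volume (in cm^-1) for a cube with the given side."""
    surface_area = 6 * side_cm ** 2   # a cube has six square faces
    volume = side_cm ** 3
    return surface_area / volume

small = surface_to_volume_ratio(1.0)    # 6 cm^2 of surface for 1 cm^3 of volume
large = surface_to_volume_ratio(10.0)   # 600 cm^2 of surface for 1000 cm^3 of volume
print(small, large)
```

For a cube the ratio simplifies to 6 divided by the side length, so a tenfold increase in side length reduces the surface area available per unit of volume tenfold.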


Figure 1.19 shows the relative size of a number of structures of interest in cell biology. Two units of linear measure are most commonly used to describe structures within a cell: the micrometer (µm) and the nanometer (nm). One µm is equal to 10⁻⁶ meters, and one nm is equal to 10⁻⁹ meters. The angstrom (Å), which is equal to one-tenth of a nm, is commonly employed by molecular biologists for atomic dimensions. One angstrom is roughly equivalent to the diameter of a hydrogen atom. Large biological molecules (i.e., macromolecules) are described in either angstroms or nanometers. Myoglobin, a typical globular protein, is approximately 4.5 nm × 3.5 nm × 2.5 nm; highly elongated proteins (such as collagen or myosin) are over 100 nm in length; and DNA is approximately 2.0 nm in width. Complexes of macromolecules, such as ribosomes, microtubules, and microfilaments, are between 5 and 25 nm in diameter. Despite their tiny dimensions, these macromolecular complexes constitute remarkably sophisticated “nanomachines” capable of performing a diverse array of mechanical, chemical, and electrical activities. Cells and their organelles are more easily defined in micrometers. Nuclei, for example, are approximately 5–10 µm in diameter, and mitochondria are approximately 2 µm in

■ Most eukaryotic cells possess a single nucleus that contains only two copies of most genes. Because genes serve as templates for the production of information-carrying messenger RNAs, a cell can only produce a limited number of these messenger RNAs in a given amount of time. The greater a cell’s cytoplasmic volume, the longer it will take to synthesize the number of messages required by that cell.

■ As a cell increases in size, the surface area/volume ratio decreases.³ The ability of a cell to exchange substances with its environment is proportional to its surface area. If a cell were to grow beyond a certain size, its surface would not be sufficient to take up the substances (e.g., oxygen, nutrients) needed to support its metabolic activities. Cells that are specialized for absorption of solutes, such as those of the intestinal epithelium, typically possess microvilli, which greatly increase the surface area available for exchange (see Figure 1.3). The interior of a large plant cell is typically filled by a large, fluid-filled vacuole rather than metabolically active cytoplasm (see Figure 8.36b).

■ A cell depends to a large degree on the random movement of molecules (diffusion). Oxygen, for example, must diffuse from the cell’s surface through the cytoplasm to the interior of its mitochondria. The time required for diffusion is proportional to the square of the distance to be traversed. For example, O2 requires only 100 microseconds to diffuse a distance of 1 µm, but requires 10⁶ times as long to diffuse a distance of 1 mm. As a cell becomes larger and the distance from the surface to the interior becomes greater, the time required for diffusion to move substances in and out of a metabolically active cell becomes prohibitively long.
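The quadratic dependence of diffusion time on distance can be checked with a short calculation. The Python sketch below assumes only what the text states: a reference time of 100 microseconds for O2 to diffuse 1 µm, and a time proportional to the square of the distance. It is a scaling estimate, not a physical simulation, and the constant and function names are illustrative.

```python
REF_DISTANCE_UM = 1.0   # reference distance from the text: 1 micrometer
REF_TIME_S = 100e-6     # reference time from the text: 100 microseconds

def diffusion_time_s(distance_um: float) -> float:
    """Estimate diffusion time, assuming time scales with distance squared."""
    return REF_TIME_S * (distance_um / REF_DISTANCE_UM) ** 2

# Compare a bacterium-sized distance, typical eukaryotic cell sizes, and 1 mm.
for distance_um in (1.0, 10.0, 30.0, 1000.0):
    print(f"{distance_um:7.1f} um -> {diffusion_time_s(distance_um):.4g} s")
```

Under this scaling, a distance of 1 mm (1000 µm) takes roughly 100 seconds, which is 10⁶ times the 100 microseconds needed for 1 µm.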


Figure 1.18 Six model organisms. (a) Escherichia coli is a rod-shaped bacterium that lives in the digestive tract of humans and other mammals. Much of what we will discuss about the basic molecular biology of the cell, including the mechanisms of replication, transcription, and translation, was originally worked out on this one prokaryotic organism. The relatively simple organization of a prokaryotic cell is illustrated in this electron micrograph (compare to part b of a eukaryotic cell). (b) Saccharomyces cerevisiae, more commonly known as baker’s yeast or brewer’s yeast. It is the least complex of the eukaryotes commonly studied, yet it contains a surprising number of proteins that are homologous to proteins in human cells. Such proteins typically have a conserved function in the two organisms. The species has a small genome encoding about 6200 proteins; it can be grown in a haploid state (one copy of each gene per cell rather than two as in most eukaryotic cells); and it can be grown under either aerobic (O2-containing) or anaerobic (O2-lacking) conditions. It is ideal for the identification of genes through the use of mutants. (c) Arabidopsis thaliana, a weed (called the thale cress) that is related to mustard and cabbage, has an unusually small genome (120 million base pairs) for a flowering plant, a rapid generation time, and large seed production, and it grows to a height of only a few inches. (d) Caenorhabditis elegans, a microscopic-sized nematode, consists of a defined number of cells (roughly 1000), each of which develops according to a precise pattern of cell divisions. The animal is easily cultured, can be kept alive in a frozen state, has a transparent body wall, a short generation time, and facility for genetic analysis. This micrograph shows the larval nervous system, which has been labeled with the green fluorescent protein (GFP). The 2002 Nobel Prize was awarded to the researchers who pioneered its study. (e) Drosophila melanogaster, the fruit fly, is a small but complex eukaryote that is readily cultured in the lab, where it grows from an egg to an adult in a matter of days. Drosophila has been a favored animal for the study of genetics, the molecular biology of development, and the neurological basis of simple behavior. Certain larval cells have giant chromosomes, whose individual genes can be identified for studies of evolution and gene expression. In the mutant fly shown here, a leg has developed where an antenna would be located in a normal (wild type) fly. (f) Mus musculus, the common house mouse, is easily kept and bred in the laboratory. Thousands of different genetic strains have been developed, many of which are stored simply as frozen embryos due to lack of space to house the adult animals. The “nude mouse” pictured here develops without a thymus gland and, therefore, is able to accept human tissue grafts that are not rejected. (A&B: BIOPHOTO ASSOCIATES/PHOTO RESEARCHERS; C: JEAN CLAUDE REVY/PHOTOTAKE; D: COURTESY OF ERIK JORGENSEN, UNIVERSITY OF UTAH. FROM TRENDS GENETICS, VOL. 14, COVER #12, 1998, WITH PERMISSION FROM ELSEVIER. E: DAVID SCHARF/PHOTO RESEARCHERS, INC. F: TED SPIEGEL/© CORBIS IMAGES.)

If, as most biologists would argue, the properties and activities of a cell spring from the genetic blueprint of that cell, then it should be possible to create a new type of cell by introducing a new genetic blueprint into the cytoplasm of an existing cell. This feat was accomplished by J. Craig Venter and colleagues in 2007, when they replaced the genome of one bacterium with a genome isolated from a closely related species, effectively transforming one species into the other. By

Figure 1.20 The synthetic biologist’s toolkit of the future? Such a toolkit would presumably contain nucleic acids, proteins, lipids, and many other types of biomolecules. (COURTESY OF JAKOB C. SCHWEIZER.)

Figure 1.19 Relative sizes of cells and cell components. These structures differ in size by more than seven orders of magnitude. Structures shown along the scale (1 Å to 1 mm): hydrogen atom (1 Å diameter); water molecule (4 Å diameter); DNA molecule (2 nm wide); myoglobin (4.5 nm diameter); lipid bilayer (5 nm wide); actin filament (6 nm diameter); ribosome (30 nm diameter); HIV (100 nm diameter); cilium (250 nm diameter); bacterium (1 µm long); mitochondrion (2 µm long); chloroplast (8 µm diameter); lymphocyte (12 µm diameter); epithelial cell (30 µm height); Paramecium (1.5 mm long); frog egg (2.5 mm diameter). Ranges marked on the figure: electron microscope, standard light microscope, human vision.
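The span of sizes in Figure 1.19 can be checked by converting the smallest and largest labeled structures to meters and taking the base-10 logarithm of their ratio. The Python sketch below uses the hydrogen atom (1 Å) and the frog egg (2.5 mm) as labeled in the figure; the dictionary and function names are illustrative only.

```python
import math

# Conversion factors to meters for the units used in the figure.
UNIT_IN_METERS = {"angstrom": 1e-10, "nm": 1e-9, "um": 1e-6, "mm": 1e-3}

def to_meters(value: float, unit: str) -> float:
    """Convert a length in the given unit to meters."""
    return value * UNIT_IN_METERS[unit]

smallest = to_meters(1.0, "angstrom")   # hydrogen atom diameter
largest = to_meters(2.5, "mm")          # frog egg diameter
orders_of_magnitude = math.log10(largest / smallest)
print(f"{orders_of_magnitude:.1f}")
```

The result falls between 7 and 8, in line with the caption's statement that the structures differ by more than seven orders of magnitude.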

REVIEW
1. Compare a prokaryotic and eukaryotic cell on the basis of structural, functional, and metabolic differences.
2. What is the importance of cell differentiation?
3. Why are cells almost always microscopic?
4. If a mitochondrion were 2 µm in length, how many angstroms would it be? How many nanometers? How many millimeters?


2010, after overcoming a number of stubborn technical roadblocks, the team was able to accomplish a similar feat using a copy of a bacterial genome that had been assembled (inside a yeast cell) from fragments of DNA that had been chemically synthesized in the laboratory. The synthetic copy of the donor genome, which totaled approximately 1.1 million base pairs of DNA, contained a number of modifications introduced by the researchers. The modified copy of the genome (from M. mycoides) was transplanted into a cell of a closely related bacterial species (M. capricolum), where it replaced the host’s original genome. Following genome transplantation, the recipient cell rapidly took on the characteristics of the species from which the donor DNA had been derived. In effect, these researchers have produced cells containing a “genetic skeleton” to which they can add combinations of new genes taken from other organisms.

Researchers around the world are attempting to genetically engineer organisms to possess metabolic pathways capable of producing pharmaceuticals, hydrocarbon-based fuel molecules, and other useful chemicals from cheap, simple precursors. At least one company has claimed to be growing genetically engineered cyanobacteria capable of producing diesel fuel from sunlight, water, and CO2. Researchers at another company have genetically engineered the common lab bacterium E. coli to ferment the complex polysaccharides present in seaweed into the biofuel ethanol. This feat required the introduction into E. coli of a combination of genes derived from three other bacterial species. Work has also begun on “rewriting” the yeast genome, signifying that eukaryotic cells have also become part of the effort to design genetically engineered biological manufacturing plants. In principle, the work described in the Human Perspective, in which one type of cell is directed into the formation of an entirely different type of cell, is also a form of synthetic biology. As a result of these many efforts, biologists are no longer restricted to studying cells that are available in Nature, but can also turn their attention to cells that can become available through experimental manipulation.

THE HUMAN PERSPECTIVE

The Prospect of Cell Replacement Therapy

Many human diseases result from the deaths of specific types of cells. Type 1 diabetes, for example, results from the destruction of beta cells in the pancreas; Parkinson’s disease occurs with the loss of dopamine-producing neurons in the brain; and heart failure can be traced to the death of cardiac muscle cells (cardiomyocytes) in the heart. Imagine the possibilities if we could isolate cells from a patient, convert them into the cells that are needed by that patient, and then infuse them back into the patient to restore the body’s lost function. Recent studies have given researchers hope that one day this type of therapy will be commonplace. To better understand the concept of cell replacement therapy, we can consider a procedure used widely in current practice known as bone marrow transplantation, in which cells are extracted from the pelvic bones of a donor and infused into the body of a recipient. Bone marrow transplantation is used most often to treat lymphomas and leukemias, which are cancers that affect the nature and number of white blood cells. To carry out the procedure, the patient is exposed to a high level of radiation and/or toxic chemicals, which kills the cancer cells, but also kills all of the cells involved in the formation of red and white blood cells. This treatment has this effect because blood-forming cells are particularly sensitive to radiation and toxic chemicals. Once a person’s blood-forming cells have been destroyed, they are replaced by bone marrow cells transplanted from a healthy donor. Bone marrow can regenerate the blood tissue of the transplant recipient because it contains a small percentage of cells that can proliferate and restock the patient’s blood-forming bone marrow tissue.¹ These blood-forming cells in the bone marrow are termed hematopoietic stem cells (or HSCs), and they were discovered in the early 1960s by Ernest McCulloch and James Till at the University of Toronto.
HSCs are responsible for replacing the millions of red and white blood cells that age and die every minute in our bodies (see Figure 17.6). Amazingly, a single HSC is capable of reconstituting the entire hematopoietic (blood-forming) system of an irradiated mouse. An increasing number of parents are saving the blood from the umbilical cord of their newborn baby as a type of “stem-cell insurance policy” in case that child should ever develop a disease that might be treated by administration of HSCs. Now that we have described one type of cell replacement therapy, we can consider several other types that have a much wider therapeutic potential. We will divide these potential therapies into four types.

Adult Stem Cells

Hematopoietic stem cells in the bone marrow are an example of an adult stem cell. Stem cells are defined as undifferentiated cells that (1) are capable of self-renewal, that is, production of more cells like themselves, and (2) are multipotent, that is, are capable of differentiating into two or more mature cell types. HSCs of the bone marrow are only one type of adult stem cell. Most, if not all, of the organs in a human adult contain stem cells that are capable of replacing the particular cells of the tissue in which they are found. Even the adult brain, which is not known for its ability to regenerate, contains stem cells that can generate new neurons and glial cells (the supportive cells of the brain). Figure 1a shows an isolated stem cell present in adult skeletal muscle;

¹ Bone marrow transplantation can be contrasted to a simple blood transfusion where the recipient receives differentiated blood cells (especially red blood cells and platelets) present in the circulation.


Figure 1 An adult muscle stem cell. (a) A portion of a muscle fiber, with its many nuclei stained blue. A single stem cell (yellow) is seen to be lodged between the outer surface of the muscle fiber and an extracellular layer (or basement membrane), which is stained red. The undifferentiated stem cell exhibits this yellow color because it expresses a protein that is not present in the differentiated muscle fiber. (b) Adult stem cells undergoing differentiation into adipose (fat) cells in culture. Stem cells capable of this process are present in adult fat tissue and also bone marrow. (A: FROM CHARLOTTE A. COLLINS; ET AL., CELL 122:291, 2005; BY PERMISSION OF ELSEVIER; B: COURTESY OF THERMO FISHER SCIENTIFIC, FROM NATURE 451:855, 2008.)

these “satellite cells,” as they are called, are thought to divide and differentiate as needed for the repair of injured muscle tissue. Figure 1b shows a culture of adipose (fat) cells that have differentiated in vitro from adult stem cells that are present within fat tissue. The adult human heart contains stem cells that are capable of differentiating into the cells that form both the muscle tissue of the heart (the cardiomyocytes of the myocardium) and the heart’s blood vessels. It had been hoped that these cardiac stem cells might have the potential to regenerate healthy heart tissue in a patient who had experienced a serious heart attack. This hope has apparently been realized based on the appearance of two landmark reports in late 2011 on the results from clinical trials of patients that had suffered significant heart-tissue damage following heart attacks. Stem cells were harvested from each of the patients during heart surgeries, expanded in number through in vitro culture, and then infused back into each patient’s heart. Over the next few months, a majority of treated patients experienced significant replacement (e.g., 50 percent) of the damaged heart muscle by healthy tissue derived from the infused stem cells. This regeneration of heart tissue was accompanied by a clear improvement in quality of life compared to patients in the placebo group that did not receive stem cells. Adult stem cells are an ideal system for cell replacement therapies because they represent an autologous treatment; that is, the cells are taken from the same patient in which they are used. Consequently, these stem cells do not face the prospect of immune rejection. At the same time, however, adult stem cells are very scarce within the tissue in question, often difficult to isolate and work with, and are only likely to replace cells from the same tissue from which they are taken. 
These dramatic results with cardiac stem cells may have rekindled interest in adult stem cells, which had waned after a number of failed attempts to direct stem cells isolated from bone marrow to regenerate diseased tissues.

Embryonic Stem Cells

Much of the excitement that has been generated in the field over the past decade or two has come from studies on embryonic stem (ES) cells, which are a type of stem cell isolated from very young mammalian embryos (Figure 2a). These are the cells in the early embryo that give rise to all of the various structures of the mammalian fetus. Unlike adult stem cells, ES cells are pluripotent; that is, they are capable of differentiating into every type of cell in the body. In most cases, human ES cells have been isolated from embryos provided by in vitro fertilization clinics. Worldwide, dozens of genetically distinct human ES cell lines, each derived from a single embryo, are available for experimental investigation. The long-range goal of clinical researchers is to learn how to coax ES cells to differentiate in culture into each of the many cell types that might be used for cell replacement therapy. Considerable progress has been made in this pursuit, and numerous studies have shown that transplants of differentiated, ES-derived cells can improve the condition of animals with diseased or damaged organs. The first trials in humans were begun in 2009 on patients who had experienced debilitating spinal cord injuries or were suffering from an eye disease called Stargardt’s macular dystrophy. The trials to treat spinal cord injuries utilize cells, called oligodendrocytes, that produce the myelin sheaths that become wrapped around nerve cells (see Figure 4.5). The oligodendrocytes used in these trials were differentiated from human ES cells that were cultured in a medium containing insulin, thyroid hormone, and a combination of certain growth factors. This particular culture protocol had been found to direct the differentiation of ES cells into oligodendrocytes rather than any other cell type.
At the time of this writing, no significant improvement had been reported in any of the treated patients, and the company conducting the trial has decided to cease further involvement in the effort. The primary risk with the therapeutic use of ES cells is the unnoticed presence of undifferentiated ES cells among the differentiated cell population. Undifferentiated ES

cells are capable of forming a type of benign tumor, called a teratoma, which may contain a bizarre mass of various differentiated tissues, including hair and teeth. The formation of a teratoma within the central nervous system could have severe consequences. In addition, the culture of ES cells at the present time involves the use of nonhuman biological materials, which also poses potential risks. The ES cells used in these early trials were derived from cell lines that had been isolated from human embryos unrelated to the patients who are being treated. Such cells face the prospect of immunologic rejection by the transplant recipient. It may be possible, however, to “customize” ES cells so that they possess the same genetic makeup of the individual who is being treated. This may be accomplished one day by a roundabout procedure called somatic cell nuclear transfer (SCNT), shown in Figure 2b, that begins with an unfertilized egg—a cell that is obtained from the ovaries of an unrelated woman donor. In this approach, the nucleus of the unfertilized egg would be replaced by the nucleus of a cell from the patient to be treated, which would cause the egg to have the same chromosome composition as that of the patient. The egg would then be allowed to develop to an early embryonic stage, and the ES cells would be removed, cultured, and induced to differentiate into the type of cells needed by the patient. Because this procedure involves the formation of a human embryo that is used only as a source of ES cells, there are major ethical questions that must be settled before it could be routinely practiced. In addition, the process of SCNT is so expensive and technically demanding that it is highly improbable that it could ever be practiced as part of any routine medical treatment. It is more

Figure 2 Embryonic stem cells; their isolation and potential use. (a) Micrograph of a mammalian blastocyst, an early stage during embryonic development, showing the inner cell mass, which is composed of pluripotent ES cells. Once isolated, such cells are readily grown in culture. (b) A potential procedure for obtaining differentiated cells for use in cell replacement therapy. A small piece of tissue is taken from the patient, and one of the somatic cells is fused with a donor oocyte whose own nucleus had been previously removed. The resulting oocyte (egg), with the patient’s cell nucleus, is allowed to develop into an early embryo, and the ES cells are harvested and grown in culture. A population of ES cells is induced to differentiate into the required cells, which are subsequently transplanted into the patient to restore organ function. (At the present time, it has not been possible to obtain blastocyst stage embryos, that is, ones with ES cells, from any primate species by the procedure shown here, although it has been accomplished using an oocyte from which the nucleus is not first removed. The ES cells that are generated in such experiments are triploid; that is, they have three copies of each chromosome (one from the oocyte and two from the donor nucleus) rather than two, as would normally be the case. Regardless, these triploid ES cells are pluripotent and capable of transplantation.) (A: © PHANIE/SUPERSTOCK)

Steps labeled in panel (b): remove somatic cells from the patient; fuse a somatic cell with an enucleated oocyte; allow to develop to blastocyst; grow ES cells in culture; induce ES cells to differentiate (muscle, blood, liver, or nerve cells); transplant the required differentiated cells back into the patient.

likely that, if ES cell-based therapy is ever practiced, it would depend on the use of a bank of hundreds or thousands of different ES cells. Such a bank could contain cells that are close enough as a tissue match to be suitable for use in the majority of patients.


Induced Pluripotent Stem Cells

It had long been thought that the process of cell differentiation in mammals was irreversible; once a cell had become a fibroblast, or white blood cell, or cartilage cell, it could never again revert to any other cell type. This concept was shattered in 2006 when Shinya Yamanaka and co-workers of Kyoto University announced a stunning discovery: his lab had succeeded in reprogramming a fully differentiated mouse cell (in this case a type of connective tissue fibroblast) into a pluripotent stem cell. They accomplished the feat by introducing, into the mouse fibroblast, the genes that encoded four key proteins that are characteristic of ES cells. These genes (Oct4, Sox2, Klf4, and Myc, known collectively as OSKM) are thought to play a key role in maintaining the cells in an undifferentiated state and allowing them to continue to self-renew. The genes were introduced into cultured fibroblasts using gene-carrying viruses, and those rare cells that became reprogrammed were selected from the others in the culture by specialized techniques. They called this new type of cell induced pluripotent stem cells (iPS cells) and demonstrated that they were indeed pluripotent by injecting them into a mouse blastocyst and finding that they participated in the differentiation of all the cells of the body, including eggs and sperm. Within the next year or so the same reprogramming feat had been accomplished in several labs with human cells. What this means is that researchers now have available to them an unlimited supply of pluripotent cells that can be directed to differentiate into various types of body cells using similar experimental protocols to those already developed for ES cells. Indeed, iPS cells have already been used to correct certain disease conditions in experimental animals, including sickle cell anemia in mice as depicted in Figure 3. iPS cells have also been prepared from adult cells taken from patients with a multitude of genetic disorders.
Figure 3 Steps taken to generate induced pluripotent stem (iPS) cells for use in correcting the inherited disease sickle cell anemia in mice. Skin cells are collected from the diseased animal, reprogrammed in culture by introducing the four required genes that are ferried into the cells by viruses, and allowed to develop into undifferentiated pluripotent iPS cells. The iPS cells are then treated so as to replace the defective (globin) gene with a normal copy, and the corrected iPS cells are caused to differentiate into normal blood stem cells in culture. These blood stem cells are then injected back into the diseased mouse, where they proliferate and differentiate into normal blood cells, thereby curing the disorder. (REPRINTED FROM AN ILLUSTRATION BY RUDOLF JAENISCH, CELL 132:5, 2008, WITH PERMISSION FROM ELSEVIER.)

Researchers are then able to follow the differentiation of these iPS cells in culture into the specialized cell types that are affected by the particular disease. An example of this type of experiment is portrayed on the front cover of this book. It is hoped that such studies will reveal the mechanisms of disease formation as it unfolds in a culture dish just as it would normally occur in an unobservable way deep within the body. These “diseased iPS cells” have been referred to as “patients in a Petri dish.” The clinical relevance of these cells can be illustrated by an example. iPS cells derived from patients with a heart disorder called long QT syndrome differentiate into cardiac muscle cells that exhibit irregular contractions (“beats”) in culture. This disease-specific phenotype seen in culture can be corrected by several medicines normally prescribed to treat this disorder. Moreover, when these cardiomyocytes that had differentiated from the diseased iPS cells were exposed to the drug cisapride, the irregularity of their contractions increased. Cisapride is a drug that was used to treat heartburn before it was pulled from the market in the United States after it was shown to cause heart arrhythmias in certain patients. Results of this type suggest that differentiated cells derived from diseased iPS cells will serve as valuable targets for screening potential drugs for their effectiveness in halting disease progression. Unlike ES cells, the generation of iPS cells does not require the use of an embryo. This feature removes all of the ethical reservations that accompany work with ES cells and also makes it much easier to generate these cells in the lab. However, as research on iPS cells has increased, the therapeutic potential for these cells has become less clear. For the first several years of study, it was thought that iPS cells and ES cells were essentially indistinguishable. Recent studies, however, have shown that iPS cells lack the “high quality” characteristic of ES cells and that not all iPS cells are the same. For example, iPS cells exhibit certain genomic abnormalities that are not present in ES cells, including the presence of mutations and extra copies of random segments of the genome.
In addition, the DNA-containing chromatin of iPS cells retains certain traces of the original cells from which they were derived, which means that they are not completely reprogrammed into ES-like, pluripotent cells. This residual memory of their origin makes it easier to direct iPS cells toward differentiation back into the cells from which they were derived than into other types of cells. It may be that these apparent deficiencies in iPS cells will not be a serious impediment in the use of these cells to treat diseases that affect adult tissues, but they have raised important questions. There are other issues with iPS cells as well. It will be important to develop efficient cell reprogramming techniques that do not use genome-integrating viruses because such cells carry the potential of developing into cancers. Progress has been made in this regard, but the efficiency of iPS cell formation typically drops when other procedures are used to introduce genes. Like ES cells, undifferentiated iPS cells also give rise to teratomas, so it is essential that only fully differentiated cells are transplanted into human subjects. Also like ES cells, the iPS cells in current use have the same tissue antigens as the donors who originally provided them, so they would stimulate an immune attack if they were to be transplanted into other human recipients. Unlike the formation of ES cells, however, it will be much easier to generate personalized, tissue-compatible iPS cells, because they can be derived from a simple skin biopsy from each patient. Still, it does take considerable time, expense, and technical expertise to generate a population of iPS cells from a specific donor. Consequently, if iPS cells are ever developed for widespread therapeutic use, they would likely come from a large cell bank that could provide cells that are close tissue matches to most potential recipients. It may also be possible to remove all of the genes from iPS cells that normally prevent them from being transplanted into random recipients.

[Figure 3 labels: Mouse with sickle cell anemia; Collect skin cells; Oct4, Sox2, Klf4, c-Myc viruses; Reprogram into ES-like iPS cells; Genetically identical mutant iPS cells; CORRECT SICKLE CELL MUTATION: DNA with mutant gene (yellow), DNA with normal gene (red); Genetically corrected iPS cells; Differentiate into blood stem cells; Transplant; Recovered mouse]

Direct Cell Reprogramming In 2008 the field of cellular reprogramming took another unexpected turn with the announcement that one type of differentiated cell had been converted directly into another type of differentiated cell, a case of “transdifferentiation.” In this report, the acinar cells of the pancreas, which produce enzymes responsible for digestion of food in the intestine, were transformed into pancreatic beta cells, which synthesize and secrete the hormone insulin. The reprogramming process occurred directly, in a matter of a few days, without the cells passing through an intermediate stem cell state, and it occurred while the cells remained in their normal residence within the pancreas of a live mouse. This feat was accomplished by injection into the animals of viruses that carried three genes known to be important in differentiation of beta cells in the embryo. In this case, the recipients of the injection were diabetic mice, and the transdifferentiation of a significant number of acinar cells into beta cells allowed the animals to regulate their blood sugar levels with much lower doses of insulin. It is also noteworthy that the adenoviruses used to deliver the genes in this experiment do not become a permanent part of the recipient cell, which removes some of the concerns about the use of viruses as gene carriers in humans. Since this initial report, a number of laboratories have developed in vitro techniques to directly convert one type of differentiated cell (typically a fibroblast) into another type of cell, such as a neuron, cardiomyocyte, or blood-cell precursor, in culture, without passing through a pluripotent intermediate. In all of these cases, transdifferentiation occurs when the original cells are forced to express certain genes that play a role in the normal embryonic differentiation of the other cell type. It is too early to know whether this type of direct reprogramming strategy has therapeutic potential, but it certainly raises the prospect that diseased cells that need to be replaced might be formed directly from other types of cells within the same organ.

1.4 | Viruses By the end of the nineteenth century, the work of Louis Pasteur and others had convinced the scientific world that infectious diseases of plants and animals were due to bacteria. But studies of tobacco mosaic disease in tobacco plants and hoof-and-mouth disease in cattle pointed to the existence of another type of infectious agent. It was found, for example, that sap from a diseased tobacco plant could transmit mosaic disease to a healthy plant, even when the sap showed no evidence of bacteria in the light microscope. To gain further insight into the size and nature of the infectious agent, Dmitri Ivanovsky, a Russian biologist, forced the sap from a diseased plant through filters whose pores were so small that they retarded the passage of the smallest known bacterium. The filtrate was still infective, causing Ivanovsky to conclude in 1892 that certain diseases were caused by pathogens that were even smaller, and presumably simpler, than the smallest known bacteria. These pathogens became known as viruses. In 1935, Wendell Stanley of the Rockefeller Institute reported that the virus responsible for tobacco mosaic disease could be crystallized and that the crystals were infective. Substances that form crystals have a highly ordered, well-defined structure and are vastly less complex than the simplest cells. Stanley mistakenly concluded that tobacco mosaic virus (TMV) was a protein. In fact, TMV is a rod-shaped particle consisting of a single molecule of RNA surrounded by a helical shell composed of protein subunits (Figure 1.21).

Figure 1.21 Tobacco mosaic virus (TMV). (a) Model of a portion of a TMV particle. The protein subunits, which are identical along the entire rod-shaped particle, enclose a single helical RNA molecule (red). (b) Electron micrograph of TMV particles after phenol has removed the protein subunits from the middle part of the upper particle and the ends of the lower particle. Intact rods are approximately 300 nm long and 18 nm in diameter. (A&B: COURTESY OF GERALD STUBBS, KEIICHI NAMBA, AND DONALD CASPAR.)

[Figure 1.21 labels: Protein coat (capsid); Nucleic acid; scale bar, 60 nm]

Figure 1.22 Virus diversity. The structures of (a) an adenovirus, (b) a human immunodeficiency virus (HIV), and (c) a T-even bacteriophage. Note: These viruses are not drawn to the same scale.

[Figure 1.22 labels: gp120 coat protein; RNA; Reverse transcriptase; Protein coat; Nucleic acid; Lipid bilayer]

Chapter 1 Introduction to the Study of Cell and Molecular Biology


Viruses are responsible for dozens of human diseases, including AIDS, polio, influenza, cold sores, measles, and a few types of cancer. Viruses occur in a wide variety of very different shapes, sizes, and constructions, but all of them share certain common properties. All viruses are obligatory intracellular parasites; that is, they cannot reproduce unless present within a host cell. Depending on the specific virus, the host may be a plant, animal, or bacterial cell. Outside of a living cell, the virus exists as a particle, or virion, which is little more than a macromolecular package. The virion contains a small amount of genetic material that, depending on the virus, can be single-stranded or double-stranded, RNA or DNA. Remarkably, some viruses have as few as three or four different genes, but others may have as many as several hundred. The genetic material of the virion is surrounded by a protein capsule, or capsid. Virions are macromolecular aggregates, inanimate particles that by themselves are unable to reproduce, metabolize, or carry on any of the other activities associated with life. For this reason, viruses are not considered to be organisms and are not described as being alive.

Viral capsids are generally made up of a specific number of subunits. There are numerous advantages to construction by subunit, one of the most apparent being an economy of genetic information. If a viral coat is made of many copies of a single protein, as is that of TMV, or a few proteins, as are the coats of many other viruses, the virus needs only one or a few genes to code for its protein container. Many viruses have a capsid whose subunits are organized into a polyhedron, that is, a structure having planar faces. A particularly common polyhedral shape of viruses is the 20-sided icosahedron. For example, adenovirus, which causes respiratory infections in mammals, has an icosahedral capsid (Figure 1.22a). In many animal viruses, including the human immunodeficiency virus (HIV) responsible for AIDS, the protein capsid is surrounded by a lipid-containing outer envelope that is derived from the modified plasma membrane of the host cell as the virus buds from the host-cell surface (Figure 1.22b). Bacterial viruses, or bacteriophages, are among the most complex viruses (Figure 1.22c). They are also the most abundant biological entities on Earth. The T bacteriophages (which were used in key experiments that revealed the structure and properties of the genetic material) consist of a polyhedral head containing DNA, a cylindrical stalk through which the DNA is injected into the bacterial cell, and tail fibers, which together cause the particle to resemble a landing module for the moon.

Each virus has on its surface a protein that is able to bind to a particular surface component of its host cell. For example, the protein that projects from the surface of the HIV particle (labeled gp120 in Figure 1.22b, which stands for glycoprotein of molecular mass 120,000 daltons⁴) interacts with a specific protein (called CD4) on the surface of certain white blood cells, facilitating entry of the virus into its host cell. The interaction between viral and host proteins determines the specificity of the virus, that is, the types of host cells that the virus can enter and infect. Some viruses have a wide host range, being able to infect cells from a variety of different organs or host species. The virus that causes rabies, for example, is able to infect many different types of mammalian hosts, including dogs, bats, and humans. Most viruses, however, have a relatively narrow host range. This is true, for example, of human cold and influenza viruses, which are generally able to infect only the respiratory epithelial cells of human hosts.

A change in host-cell specificity can have striking consequences. This point is dramatically illustrated by the 1918 influenza pandemic, which killed more than 30 million people worldwide. The virus was especially lethal in young adults, who do not normally fall victim to influenza. In fact, the 675,000 deaths from this virus in the United States temporarily lowered average life expectancy by several years. In one of the most acclaimed, and controversial, feats of the past few years, researchers have been able to determine the genomic sequence of the virus responsible for this pandemic and to reconstitute the virus in its full virulent state. This was accomplished by isolating the viral genes (which are part of a genome consisting of 8 separate RNA molecules encoding 11 different proteins) from the preserved tissues of victims who had died from the infection 90 years earlier. The best preserved samples were obtained from a Native American woman who had been buried in the Alaskan permafrost. The sequence of the “1918 virus” suggested that the pathogen had jumped from birds to humans. Although the virus had accumulated a considerable number of mutations, which adapted it to a mammalian host, it had never exchanged genetic material with that of a human influenza virus as had been thought likely. Analysis of the sequence of the 1918 virus has provided some clues to explain why it was so deadly and how it spread so efficiently from one human to another. Using the genomic sequence, researchers reconstituted the 1918 virus into infectious particles, which were found to be exceptionally virulent in laboratory tests. Whereas laboratory mice normally survive infection by modern human influenza viruses, the reconstituted 1918 strain killed 100 percent of infected mice and produced enormous numbers of viral particles in the animals’ lungs. Because of the potential risk to public health, publication of the full sequence of the 1918 virus and its reconstitution went forward only after approval by governmental safety panels and the demonstration that existing influenza vaccines and drugs protect mice from the reconstituted virus.

⁴ One dalton is equivalent to one unit of atomic mass, the mass of a single hydrogen (¹H) atom.
There are two basic types of viral infection. (1) In most cases, the virus arrests the normal synthetic activities of the host and redirects the cell to use its available materials to manufacture viral nucleic acids and proteins, which assemble into new virions. Viruses, in other words, do not grow like cells; they are assembled from components directly into the mature-sized virions. Ultimately, the infected cell ruptures (lyses) and releases a new generation of viral particles capable of infecting neighboring cells. An example of this type of lytic infection is shown in Figure 1.23a. (2) In other cases, the infecting virus does not lead to the death of the host cell, but instead inserts (integrates) its DNA into the DNA of the host cell’s chromosomes. The integrated viral DNA is called a provirus. An integrated provirus can have different effects depending on the type of virus and host cell. For example,

■ Bacterial cells containing a provirus behave normally until exposed to a stimulus, such as ultraviolet radiation, that activates the dormant viral DNA, leading to the lysis of the cell and release of viral progeny.
■ Some animal cells containing a provirus produce new viral progeny that bud at the cell surface without lysing the infected cell. Human immunodeficiency virus (HIV) acts in this way; an infected cell may remain alive for a period, acting as a factory for the production of new virions (Figure 1.23b).
■ Some animal cells containing a provirus lose control over their own growth and division and become malignant. This phenomenon is readily studied in the laboratory by infecting cultured cells with the appropriate tumor virus.

Figure 1.23 A virus infection. (a) Micrograph showing a late stage in the infection of a bacterial cell by a bacteriophage. Virus particles are being assembled within the cell, and empty phage coats are still present on the cell surface. (b) Micrograph showing HIV particles budding from an infected human lymphocyte. (A: COURTESY OF JONATHAN KING AND ERIKA HARTWIG; B: COURTESY OF HANS GELDERBLOM.)

Viruses are not without their virtues. Because the activities of viral genes mimic those of host genes, investigators have used viruses for decades as a research tool to study the mechanism of DNA replication and gene expression in their much more complex hosts. In addition, viruses are now being used as a means to introduce foreign genes into human cells, a technique that will likely serve as the basis for the treatment of human diseases by gene therapy. Lastly, insect- and bacteria-killing viruses may play an increasing role in the war against insect pests and bacterial pathogens. Bacteriophages have been used for decades to treat bacterial infections in eastern Europe and Russia, while physicians in the West have relied on antibiotics. Given the rise in antibiotic-resistant bacteria, bacteriophages may be making a comeback on the heels of promising studies on infected mice. Several biotechnology companies are now producing bacteriophages intended to combat bacterial infections and to protect certain foods from bacterial contamination.

Viroids It came as a surprise in 1971 to discover that viruses are not the simplest types of infectious agents. In that year, T. O. Diener of the U.S. Department of Agriculture reported that potato spindle-tuber disease, which causes potatoes to become gnarled and cracked, is caused by an infectious agent consisting of a small circular RNA molecule that totally lacks a protein coat. Diener named the pathogen a viroid. The RNAs of viroids range in size from about 240 to 600 nucleotides, one-tenth the size of the smaller viruses. No evidence has been found that the naked viroid RNA encodes any proteins. Rather, any biochemical activities in which viroids engage take place using host-cell proteins. For example, duplication of the viroid RNA within an infected cell utilizes the host’s RNA polymerase II, an enzyme that normally transcribes the host’s DNA into messenger RNAs. Viroids are thought to cause disease by interfering with the cell’s normal path of gene expression. The effect on crops can be serious: a viroid disease called cadang-cadang has devastated the coconut palm groves of the Philippines, and another viroid has wreaked havoc on the chrysanthemum industry in the United States. The discovery of a different type of infectious agent even simpler than a viroid is described in the Human Perspective in Chapter 2.

REVIEW
1. What properties distinguish a virus from a bacterium?
2. What types of infections are viruses able to cause?
3. Compare and contrast: nucleoid and nucleus; the flagellum of a bacterium and a sperm; an archaebacterium and a cyanobacterium; nitrogen fixation and photosynthesis; bacteriophages and tobacco mosaic virus; a provirus and a virion.

EXPERIMENTAL PATHWAYS

The Origin of Eukaryotic Cells We have seen in this chapter that cells can be divided into two groups: prokaryotic cells and eukaryotic cells. Almost from the time this division of cellular life was proposed, biologists have been fascinated by the question: What is the origin of the eukaryotic cell? It is generally (but not universally) agreed that prokaryotic cells (1) arose before eukaryotic cells and (2) gave rise to eukaryotic cells. The first point can be verified directly from the fossil record, which shows that prokaryotic cells were present in rocks approximately 2.7 billion years old (page 7), which is roughly one billion years before any evidence is seen of eukaryotes. The second point follows from the fact that the two types of cells have to be related to one another because they share many complex traits (e.g., very similar genetic codes, enzymes, metabolic pathways, and plasma membranes) that could not have evolved independently in different organisms.

Up until about 1970, it was generally believed that eukaryotic cells evolved from prokaryotic cells by a process of gradual evolution in which the organelles of the eukaryotic cell became progressively more complex. Acceptance of this concept changed dramatically about that time largely through the work of Lynn Margulis, then at Boston University. Margulis resurrected an idea that had been proposed earlier, and dismissed, that certain organelles of a eukaryotic cell, most notably the mitochondria and chloroplasts, had evolved from smaller prokaryotic cells that had taken up residence in the cytoplasm of a larger host cell.¹,² This hypothesis is referred to as the endosymbiont theory because it describes how a single “composite” cell of greater complexity could evolve from two or more separate, simpler cells living in a symbiotic relationship with one another.

Our earliest prokaryotic ancestors were presumed to have been anaerobic heterotrophic cells: anaerobic meaning they derived their energy from food matter without employing molecular oxygen (O2) and heterotrophic meaning they were unable to synthesize organic compounds from inorganic precursors (such as CO2 and water), but instead had to obtain preformed organic compounds from their environment. According to one version of the endosymbiont theory, a large, anaerobic, heterotrophic prokaryote ingested a small, aerobic prokaryote (step 1, Figure 1). Resisting digestion within the cytoplasm, the small aerobic prokaryote took up residence as a permanent endosymbiont. As the host cell reproduced, so did the endosymbiont, so that a colony of these composite cells was soon produced. Over many generations, endosymbionts lost many of the traits that were no longer required for survival, and the once-independent oxygen-respiring microbes evolved into precursors of modern-day mitochondria (step 2, Figure 1). A cell whose ancestors had formed through the sequence of symbiotic events just described could have given rise to a line of cells that evolved other basic characteristics of eukaryotic cells, including a system of membranes (a nuclear membrane, endoplasmic reticulum, Golgi complex, lysosomes), a complex cytoskeleton, and a mitotic type of cell division. These characteristics are proposed to have arisen by a gradual process of evolution, rather than in a single step as might occur through acquisition of an endosymbiont. The endoplasmic reticulum and nuclear membranes, for example, might have been derived from a portion of the cell’s outer plasma membrane that became internalized and then modified into a different type of membrane (step 3, Figure 1). A cell that possessed internal membrane-bound compartments would have been an ancestor of a heterotrophic eukaryotic cell, such as a fungal cell or a protist (step 4, Figure 1). The oldest fossils thought to be the remains of eukaryotes date back about 1.8 billion years. Margulis proposed that the acquisition of another endosymbiont, specifically a cyanobacterium, converted an early heterotrophic eukaryote into an ancestor of photosynthetic eukaryotes: the green algae and plants (step 5, Figure 1).³ The acquisition of chloroplasts (roughly one billion years ago) must have been one of the last steps in the sequence of endosymbioses because these organelles are only present in plants and algae. In contrast, all known groups of eukaryotes either (1) possess mitochondria or (2) show definitive evidence they have evolved from organisms that possessed these organelles.ᵃ The concept that mitochondria and chloroplasts arose via evolution from symbiotic organisms is now supported by an overwhelming body of evidence, some of which will be described in numerous chapters of this text.

The division of all living organisms into two categories, prokaryotes and eukaryotes, reflects a basic dichotomy in the structures of cells, but it is not necessarily an accurate phylogenetic distinction, that is, one that reflects the evolutionary relationships among living organisms.

Figure 1 A model depicting possible steps in the evolution of eukaryotic cells, including the origin of mitochondria and chloroplasts by endosymbiosis. In step 1, a large anaerobic, heterotrophic prokaryote takes in a small aerobic prokaryote. Evidence strongly indicates that the engulfed prokaryote was an ancestor of modern-day rickettsia, a group of bacteria that causes typhus and other diseases. In step 2, the aerobic endosymbiont has evolved into a mitochondrion. In step 3, a portion of the plasma membrane has invaginated and is seen in the process of evolving into a nuclear envelope and associated endoplasmic reticulum. The pre-eukaryote depicted in step 3 gives rise to two major groups of eukaryotes. In one path (step 4), a primitive eukaryote evolves into nonphotosynthetic protist, fungal, and animal cells. In the other path (step 5), a primitive eukaryote takes in a photosynthetic prokaryote, which will become an endosymbiont and evolve into a chloroplast. (Note: The engulfment of the symbiont shown in step 1 could have occurred after development of some of the internal membranes, but evidence suggests it was a relatively early step in the evolution of eukaryotes.)
ᵃ There are a number of anaerobic unicellular eukaryotes (e.g., the intestinal parasite Giardia) that lack mitochondria. For years, these organisms formed the basis for a proposal that mitochondrial endosymbiosis was a late event that took place after the evolution of these mitochondria-lacking groups. However, recent analysis of the nuclear DNA of these organisms indicates the presence of genes that were likely transferred to the nucleus from mitochondria, suggesting that the ancestors of these organisms lost their mitochondria during the course of evolution.

[Figure 1 labels: steps 1 to 5; Plasma membrane; DNA; Aerobic prokaryote; Anaerobic, heterotrophic prokaryote; Aerobic, heterotrophic prokaryote; Mitochondrion; Pre-eukaryote; Plasma membrane invagination; Nuclear envelope precursor; Endoplasmic reticulum precursor; Primitive eukaryotic cell; Protist, fungal, animal cells; Photosynthetic cyanobacterium; Algal and plant cells; Chloroplast; Vacuole; Nucleus; Endoplasmic reticulum; Cell wall]

How does one determine evolutionary relationships among organisms that have been separated in time for billions of years, such as prokaryotes and eukaryotes? Most taxonomic schemes that attempt to classify organisms are based heavily on anatomic or physiologic characteristics. In 1965, Emile Zuckerkandl and Linus Pauling suggested an alternate approach based on comparisons of the structure of informational molecules (proteins and nucleic acids) of living organisms.⁴ Differences between organisms in the sequence of amino acids that make up a protein or the sequence of nucleotides that make up a nucleic acid are the result of mutations in DNA that have been transmitted to offspring. Mutations can accumulate in a given gene at a relatively constant rate over long periods of time. Consequently, comparisons of amino acid or nucleotide sequences can be used to determine how closely organisms are related to one another. For example, two organisms that are closely related, that is, have diverged only recently from a common ancestor, should have fewer sequence differences in a particular gene than two organisms that are distantly related, that is, do not have a recent common ancestor. Using this type of sequence information as an “evolutionary clock,” researchers can construct phylogenetic trees showing proposed pathways by which different groups of living organisms may have diverged from one another during the course of evolution.

In the mid-1970s, Carl Woese and his colleagues at the University of Illinois began a series of studies that compared the nucleotide sequence in different organisms of the RNA molecule that resides in the small subunit of the ribosome. This RNA, which is called the 16S rRNA in prokaryotes or the 18S rRNA in eukaryotes, was chosen because it is present in large quantities in all cells, it is easy to purify, and it tends to change only slowly over long periods of evolutionary time, which means that it could be used to study relationships of very distantly related organisms. There was one major disadvantage: nucleic acid sequencing at the time required laborious, time-consuming methods. In their approach, they purified the 16S rRNA from a particular source, then subjected the preparation to an enzyme, T1 ribonuclease, that digests the molecule into short fragments called oligonucleotides. The oligonucleotides in the mixture were then separated from one another by two-dimensional electrophoresis to produce a two-dimensional “fingerprint” as shown in Figure 2. Once they were separated, the nucleotide sequence of each of the oligonucleotides could be determined and the sequences from various organisms compared.
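The digestion-and-comparison procedure described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not a reconstruction of the original analysis: ribonuclease T1 is modeled as cutting after every G residue, only fragments of six or more nucleotides are catalogued, and the similarity score 2 × N_AB / (N_A + N_B) is an assumed form of an association coefficient. The function names and the two example sequences are hypothetical.

```python
from collections import Counter


def t1_digest(rna):
    """Split an RNA string after every G (assumed model of T1 ribonuclease)."""
    fragments, current = [], []
    for base in rna.upper():
        current.append(base)
        if base == "G":
            fragments.append("".join(current))
            current = []
    if current:  # trailing fragment that does not end in G
        fragments.append("".join(current))
    return fragments


def catalog(rna, min_len=6):
    """Count the oligonucleotides of at least min_len bases (assumed cutoff)."""
    return Counter(f for f in t1_digest(rna) if len(f) >= min_len)


def association_coefficient(rna_a, rna_b, min_len=6):
    """Assumed score: 2 * N_AB / (N_A + N_B), counting nucleotides in large oligos."""
    cat_a, cat_b = catalog(rna_a, min_len), catalog(rna_b, min_len)
    n_a = sum(len(oligo) * n for oligo, n in cat_a.items())
    n_b = sum(len(oligo) * n for oligo, n in cat_b.items())
    shared = cat_a & cat_b  # multiset intersection of the two catalogs
    n_ab = sum(len(oligo) * n for oligo, n in shared.items())
    if n_a + n_b == 0:
        return 0.0
    return 2 * n_ab / (n_a + n_b)


# Hypothetical sequences that differ at a single position.
seq_a = "AAUCCAGCUUACAUGG"
seq_b = "AAUCCAGCUUACUUGG"
print(t1_digest(seq_a))
print(association_coefficient(seq_a, seq_b))
```

Identical catalogs give a score of 1.0, and catalogs with no large fragments in common give 0.0, so a lower score indicates less similar sequences.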
In one of their first studies, Woese and his colleagues analyzed the 16S rRNA present in the ribosomes of chloroplasts from the photosynthetic protist Euglena.⁵ They found that the sequence of this chloroplast rRNA molecule was much more similar to that of the 16S rRNA found in ribosomes of cyanobacteria than it was to its counterpart in the ribosomes from eukaryotic cytoplasm. This finding provided strong evidence for the symbiotic origin of chloroplasts from cyanobacteria. In 1977, Woese and George Fox published a landmark paper in the study of molecular evolution.⁶ They compared the nucleotide sequences of small-subunit rRNAs that had been purified from 13 different prokaryotic and eukaryotic species. The data from a comparison of all possible pairs of these organisms are shown in Table 1. The numbers along the top identify the organisms by the same numbers used along the left margin of the table. Each value in the table reflects the similarity in sequence between rRNAs from the two organisms being compared: the lower the number, the less similar the two sequences. They found that the sequences clustered into three distinct groups, as indicated in the table. It is evident that the rRNAs within each group (numbers 1–3, 4–9, and 10–13) are much more similar to one another than they are to rRNAs of the other two groups. The first of the groups shown in the table contains only eukaryotes; the second group contains the “typical” bacteria (gram-positive, gram-negative, and cyanobacteria); and the third group contains several species of methanogenic (methane-producing) “bacteria.” Woese and Fox concluded, to their surprise, that the methanogenic organisms “appear to be no more related to typical bacteria than they are to eukaryotic cytoplasms.” These results suggested that the members of these three groups represent three distinct evolutionary lines that branched apart from one another at a very early stage in the evolution of cellular organisms. Consequently, they assigned these organisms to three different kingdoms, which they named the Urkaryotes, Eubacteria, and Archaebacteria, a terminology that divided the prokaryotes into two fundamentally distinct groups.

Figure 2 Two-dimensional electrophoretic fingerprint of a T1 digest of chloroplast 16S ribosomal RNA. The RNA fragments were electrophoresed in one direction at pH 3.5 and then in a second direction at pH 2.3. (FROM L. B. ZABLEN ET AL., PROC. NAT’L. ACAD. SCI. U.S.A. 72:2419, 1975. COURTESY CARL WOESE, UNIVERSITY OF ILLINOIS.)

Table 1 Nucleotide Sequence Similarities between Representative Members of the Three Primary Kingdoms

Organisms compared:
 1. Saccharomyces cerevisiae, 18S
 2. Lemna minor, 18S
 3. L cell, 18S
 4. Escherichia coli
 5. Chlorobium vibrioforme
 6. Bacillus firmus
 7. Corynebacterium diphtheriae
 8. Aphanocapsa 6714
 9. Chloroplast (Lemna)
10. Methanobacterium thermoautotrophicum
11. M. ruminantium strain M-1
12. Methanobacterium sp., Cariaco isolate JR-1
13. Methanosarcina barkeri

       1     2     3     4     5     6     7     8     9    10    11    12    13
 1     –  0.29  0.33  0.05  0.06  0.08  0.09  0.11  0.08  0.11  0.11  0.08  0.08
 2  0.29     –  0.36  0.10  0.05  0.06  0.10  0.09  0.11  0.10  0.10  0.13  0.07
 3  0.33  0.36     –  0.06  0.06  0.07  0.07  0.09  0.06  0.10  0.10  0.09  0.07

 4  0.05  0.10  0.06     –  0.24  0.25  0.28  0.26  0.21  0.11  0.12  0.07  0.12
 5  0.06  0.05  0.06  0.24     –  0.22  0.22  0.20  0.19  0.06  0.07  0.06  0.09
 6  0.08  0.06  0.07  0.25  0.22     –  0.34  0.26  0.20  0.11  0.13  0.06  0.12
 7  0.09  0.10  0.07  0.28  0.22  0.34     –  0.23  0.21  0.12  0.12  0.09  0.10
 8  0.11  0.09  0.09  0.26  0.20  0.26  0.23     –  0.31  0.11  0.11  0.10  0.10
 9  0.08  0.11  0.06  0.21  0.19  0.20  0.21  0.31     –  0.14  0.12  0.10  0.12

10  0.11  0.10  0.10  0.11  0.06  0.11  0.12  0.11  0.14     –  0.51  0.25  0.30
11  0.11  0.10  0.10  0.12  0.07  0.13  0.12  0.11  0.12  0.51     –  0.25  0.24
12  0.08  0.13  0.09  0.07  0.06  0.06  0.09  0.10  0.10  0.25  0.25     –  0.32
13  0.08  0.08  0.07  0.12  0.09  0.12  0.10  0.10  0.12  0.30  0.24  0.32     –

Source: C. R. Woese and G. E. Fox, Proc. Nat’l. Acad. Sci. U.S.A. 74:5089, 1977.
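The three-way clustering that Woese and Fox read from Table 1 can be checked with simple arithmetic. The following Python sketch is only an illustration and is not the authors' method: it transcribes the upper half of the table (values above the diagonal) and compares the average similarity within each of the three groups with the average between groups. The names `UPPER`, `GROUPS`, `similarity`, `mean_within`, and `mean_between` are hypothetical helpers introduced here.

```python
from itertools import combinations
from statistics import mean

# Upper triangle of Table 1: UPPER[i] holds the values for organism i
# paired with organisms i+1 ... 13, read across row i of the table.
UPPER = {
    1: [0.29, 0.33, 0.05, 0.06, 0.08, 0.09, 0.11, 0.08, 0.11, 0.11, 0.08, 0.08],
    2: [0.36, 0.10, 0.05, 0.06, 0.10, 0.09, 0.11, 0.10, 0.10, 0.13, 0.07],
    3: [0.06, 0.06, 0.07, 0.07, 0.09, 0.06, 0.10, 0.10, 0.09, 0.07],
    4: [0.24, 0.25, 0.28, 0.26, 0.21, 0.11, 0.12, 0.07, 0.12],
    5: [0.22, 0.22, 0.20, 0.19, 0.06, 0.07, 0.06, 0.09],
    6: [0.34, 0.26, 0.20, 0.11, 0.13, 0.06, 0.12],
    7: [0.23, 0.21, 0.12, 0.12, 0.09, 0.10],
    8: [0.31, 0.11, 0.11, 0.10, 0.10],
    9: [0.14, 0.12, 0.10, 0.12],
    10: [0.51, 0.25, 0.30],
    11: [0.25, 0.24],
    12: [0.32],
}

# Group membership as stated in the text (organism numbers from Table 1).
GROUPS = {
    "eukaryotes": [1, 2, 3],
    "typical bacteria": [4, 5, 6, 7, 8, 9],
    "methanogens": [10, 11, 12, 13],
}


def similarity(i, j):
    """Return the tabulated similarity for two different organisms."""
    lo, hi = sorted((i, j))
    return UPPER[lo][hi - lo - 1]


def mean_within(members):
    """Average similarity over all pairs drawn from one group."""
    return mean(similarity(i, j) for i, j in combinations(members, 2))


def mean_between(group_a, group_b):
    """Average similarity over all pairs that span two groups."""
    return mean(similarity(i, j) for i in group_a for j in group_b)


for name, members in GROUPS.items():
    print(f"within {name}: {mean_within(members):.2f}")
for (name_a, a), (name_b, b) in combinations(GROUPS.items(), 2):
    print(f"{name_a} vs {name_b}: {mean_between(a, b):.2f}")
```

With these values, the within-group averages fall between roughly 0.24 and 0.33, whereas every between-group average is close to 0.1 or below, which is the pattern the text describes.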

[Figure 3 tree labels. Bacteria: Green non-sulfur bacteria, Gram positive, Purple bacteria, Cyanobacteria, Flavobacteria, Thermotogales. Archaea: Crenarchaeota (Pyrodictium, Thermoproteus) and Euryarchaeota (T. celer, Methanococcus, Methanobacterium, Methanosarcina, Halophiles). Eucarya: Animals, Entamoebae, Slime molds, Fungi, Plants, Ciliates, Flagellates, Trichomonads, Microsporidia, Diplomonads.]

Figure 3 A phylogenetic tree based on rRNA sequence comparisons showing the three domains of life. The Archaea are divided into two subgroups as indicated. (FROM C. R. WOESE ET AL., PROC. NAT’L. ACAD. SCI. U.S.A. 87:4578, 1990.)

Subsequent research provided support for the concept that prokaryotes could be divided into two distantly related lineages, and it expanded the ranks of the archaebacteria to include at least two other groups, the thermophiles, which live in hot springs and ocean vents, and the halophiles, which live in very salty lakes and seas.

In 1989, two published reports rooted the tree of life and suggested that the archaebacteria were actually more closely related to eukaryotes than they were to eubacteria.7,8 Both groups of researchers compared the amino acid sequences of several proteins that were present in a wide variety of different prokaryotes, eukaryotes, mitochondria, and chloroplasts. A phylogenetic tree constructed from sequences of ribosomal RNAs, which comes to the same conclusion, is shown in Figure 3.9 In this latter paper, Woese and colleagues proposed a revised taxonomic scheme, which has been widely accepted. In this scheme, the archaebacteria, eubacteria, and eukaryotes are assigned to separate domains, which are named Archaea, Bacteria, and Eucarya, respectively.b Each domain can then be divided into one or more kingdoms; the Eucarya, for example, can be divided into the traditional kingdoms containing fungi, protists, plants, and animals.

According to the model in Figure 3, the first major split in the tree of life produced two separate lineages, one leading to the Bacteria and the other leading to both the Archaea and the Eucarya. If this view is correct, it was an archaebacterium, not a eubacterium, that took in a symbiont and gave rise to the lineage that led to the first eukaryotic cells. Although the host prokaryote was presumably an archaebacterium, the symbionts that evolved into mitochondria and chloroplasts were almost certainly eubacteria, as indicated by their close relationship with modern members of this group.

Up until 1995, phylogenetic trees of the type shown in Figure 3 were based primarily on the analysis of the gene encoding the 16S–18S rRNA. By then, phylogenetic comparisons of a number of other genes were suggesting that the scheme depicted in Figure 3 might be oversimplified. Questions about the origin of prokaryotic and eukaryotic cells came into sharp focus between 1995 and 1997 with the publication of the entire sequences of a number of prokaryotic genomes, both archaebacterial and eubacterial, and the genome of a eukaryote, the yeast Saccharomyces cerevisiae. Researchers could now compare the sequences of hundreds of genes simultaneously, and this analysis raised a number of puzzling questions and blurred the lines of distinction between the three domains.10 For example, the genomes of several archaebacteria showed the presence of a significant number of eubacterial genes. For the most part, those genes in archaebacteria whose products are involved with informational processes (chromosome structure, transcription, translation, and replication) were very different from their counterparts in eubacterial cells and, in fact, resembled the corresponding genes in eukaryotic cells. This observation fit nicely with the scheme in Figure 3. In contrast, many of the genes in archaebacteria that encode the enzymes of metabolism exhibited an unmistakable eubacterial character.11,12 The genomes of eubacterial species also showed evidence of a mixed origin, often containing a significant number of genes that bore an archaebacterial character.13

Most investigators who study the origin of ancient organisms have held on to the basic outline of the phylogenetic tree as demarcated in Figure 3 and argue that the presence of eubacteria-like genes in archaebacteria, and vice versa, is the result of the transfer of genes from one species to another, a phenomenon referred to as lateral gene transfer (LGT).14 According to the original premise that led to the phylogenetic tree of Figure 3, genes are inherited from one’s parents, not from one’s neighbors. This is the premise that allows an investigator to conclude that two species are closely related when they both possess a gene (e.g., the rRNA gene) of similar nucleotide sequence. If, however, cells can pick up genes from other species in their environment, then two species that are actually unrelated may possess genes of very similar sequence.

An early measure of the importance of lateral gene transfer in the evolution of prokaryotes came from a study that compared the genomes of two related eubacteria, Escherichia and Salmonella. It was found that 755 genes, or nearly 20 percent of the E. coli genome, are derived from “foreign” genes transferred into the E. coli genome over the past 100 million years, the time since the two eubacteria diverged. These 755 genes were acquired as the result of at least 234 separate lateral transfers from many different sources.15 (The effect of lateral gene transfer on antibiotic resistance in pathogenic bacteria is discussed in the Human Perspective of Chapter 3.)

If genomes are a mosaic composed of genes from diverse sources, how does one choose which genes to use in determining phylogenetic relationships? According to one viewpoint, genes that are involved in informational activities (transcription, translation, replication) make the best subjects for determining phylogenetic relationships, because such genes are less likely to be transferred laterally than genes involved in metabolic reactions.16 These authors argue that the products of informational genes (e.g., rRNAs) are parts of large complexes whose components must interact with many other molecules. It is unlikely that a foreign gene product could become integrated into the existing machinery. When “informational genes” are used as the subjects of comparison, archaebacteria and eubacteria tend to separate into distinctly different groups, whereas archaebacteria and eukaryotes tend to group together as evolutionary relatives, just as they do in Figure 3. (See reference 17 for further discussion.)

Analyses of eukaryotic genomes have produced similar evidence of a mixed heritage. Studies of the yeast genome show the unmistakable presence of genes derived from both archaebacteria and eubacteria. The “informational genes” tend to have an archaeal character and the “metabolic genes” a eubacterial character.18 There are several possible explanations for the mixed character of the eukaryotic genome. Eukaryotic cells may have evolved from archaebacterial ancestors and then picked up genes from eubacteria with which they shared environments. In addition, some of the genes in the nucleus of a eukaryotic cell are clearly derived from eubacterial genes that have been transferred from the genome of the symbionts that evolved into mitochondria and chloroplasts.19 A number of researchers have taken a more radical position and proposed that the eukaryote genome was originally derived from the fusion of an archaebacterial and a eubacterial cell followed by the integration of their two genomes (e.g., reference 20).

Given these various routes of gene acquisition, it is evident that no simple phylogenetic tree, such as that depicted in Figure 3, can represent the evolutionary history of the entire genome of an organism (reviewed in references 21–23). Instead, each gene or group of genes of a particular genome may have its own unique evolutionary tree, which can be a disconcerting thought to scientists seeking to determine the origin of our earliest eukaryotic ancestors.

b Many biologists dislike the terms archaebacteria and eubacteria. Although these terms have gradually faded from the literature, being replaced simply by archaea and bacteria, many researchers in this field continue to use the former terms in published articles. Given that this is an introductory chapter in an introductory text, I have continued to refer to these organisms as archaebacteria and eubacteria to avoid possible confusion over the meaning of the term bacterial.


References
1. SAGAN (MARGULIS), L. 1967. On the origin of mitosing cells. J. Theor. Biol. 14:225–274.
2. MARGULIS, L. 1970. Origin of Eukaryotic Cells. Yale University Press.
3. SPIEGEL, F. W. 2012. Contemplating the first Plantae. Science 335:809–810.
4. ZUCKERKANDL, E. & PAULING, L. 1965. Molecules as documents of evolutionary history. J. Theor. Biol. 8:357–365.
5. ZABLEN, L. B., ET AL. 1975. Phylogenetic origin of the chloroplast and prokaryotic nature of its ribosomal RNA. Proc. Nat’l. Acad. Sci. U.S.A. 72:2418–2422.
6. WOESE, C. R. & FOX, G. E. 1977. Phylogenetic structure of the prokaryotic domain: The primary kingdoms. Proc. Nat’l. Acad. Sci. U.S.A. 74:5088–5090.
7. IWABE, N., ET AL. 1989. Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Nat’l. Acad. Sci. U.S.A. 86:9355–9359.
8. GOGARTEN, J. P., ET AL. 1989. Evolution of the vacuolar H+-ATPase: Implications for the origin of eukaryotes. Proc. Nat’l. Acad. Sci. U.S.A. 86:6661–6665.
9. WOESE, C., ET AL. 1990. Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Nat’l. Acad. Sci. U.S.A. 87:4576–4579.
10. DOOLITTLE, W. F. 1999. Lateral genomics. Trends Biochem. Sci. 24:M5–M8 (Dec.).
11. BULT, C. J., ET AL. 1996. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058–1073.
12. KOONIN, E. V., ET AL. 1997. Comparison of archaeal and bacterial genomes. Mol. Microbiol. 25:619–637.
13. NELSON, K. E., ET AL. 1999. Evidence for lateral gene transfer between Archaea and Bacteria from genome sequence of Thermotoga maritima. Nature 399:323–329.
14. OCHMAN, H., ET AL. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 405:299–304.
15. LAWRENCE, J. G. & OCHMAN, H. 1998. Molecular archaeology of the Escherichia coli genome. Proc. Nat’l. Acad. Sci. U.S.A. 95:9413–9417.
16. JAIN, R., ET AL. 1999. Horizontal gene transfer among genomes: The complexity hypothesis. Proc. Nat’l. Acad. Sci. U.S.A. 96:3801–3806.
17. MCINERNEY, J. O. & PISANI, D. 2007. Paradigm for life. Science 318:1390–1391.
18. RIVERA, M. C., ET AL. 1998. Genomic evidence for two functionally distinct gene classes. Proc. Nat’l. Acad. Sci. U.S.A. 95:6239–6244.
19. TIMMIS, J. N., ET AL. 2004. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nature Rev. Gen. 5:123–135.
20. MARTIN, W. & MÜLLER, M. 1998. The hydrogen hypothesis for the first eukaryote. Nature 392:37–41.
21. ZIMMER, C. 2009. On the origin of eukaryotes. Science 325:666–668.
22. STEEL, M. & PENNY, D. 2010. Common ancestry put to the test. Nature 465:168–169.
23. WILSON, K. L. & DAWSON, S. C. 2011. Functional evolution of nuclear structure. J. Cell Biol. 195:171–181.

| Synopsis

The cell theory has three tenets: (1) all organisms are composed of one or more cells; (2) the cell is the basic organizational unit of life; and (3) all cells arise from preexisting cells. (p. 2)

Life, as exhibited by cells, can be described by a collection of properties. Cells are very complex and their substructure is highly organized and predictable. The information to build a cell is encoded in its genes. Cells reproduce by cell division; their activities are fueled by chemical energy; they carry out enzymatically controlled chemical reactions; they engage in numerous mechanical activities; they respond to stimuli; and they are capable of a remarkable level of self-regulation. (p. 3)

Cells are either prokaryotic or eukaryotic. Prokaryotic cells are found only among archaebacteria and eubacteria, whereas all other types of organisms (protists, fungi, plants, and animals) are composed of eukaryotic cells. Prokaryotic and eukaryotic cells share many common features, including a similar cellular membrane, a common system for storing and using genetic information, and similar metabolic pathways. Prokaryotic cells are the simpler type, lacking the complex membranous organelles (e.g., endoplasmic reticulum, Golgi complex, mitochondria, and chloroplasts), chromosomes, and cytoskeleton characteristic of the cells of eukaryotes. The two cell types can also be distinguished by their mechanism of cell division, their locomotor structures, and the type of cell wall they produce (if a cell wall is present). Complex plants and animals contain many different types of cells, each specialized for particular activities. (p. 7)

Cells are almost always microscopic in size. Bacterial cells are typically 1 to 5 µm in length, whereas eukaryotic cells are typically 10 to 30 µm. Cells are microscopic in size for a number of reasons: their nuclei possess a limited number of copies of each gene; the surface area (which serves as the cell’s exchange surface) becomes limiting as a cell increases in size; and the distance between the cell surface and interior becomes too great for the cell’s needs to be met by simple diffusion. (p. 17)

Viruses are noncellular pathogens that can only reproduce when present within a living cell. Outside of the cell, the virus exists as a macromolecular package, or virion. Virions occur in a variety of shapes and sizes, but all of them consist of viral nucleic acid enclosed in a wrapper containing viral proteins. Viral infections may lead to either (1) the destruction of the host cell with accompanying production of viral progeny, or (2) the integration of viral nucleic acid into the DNA of the host cell, which often alters the activities of that cell. Viruses are not considered to be living organisms. (p. 23)
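The Synopsis point that surface area becomes limiting as a cell increases in size can be made concrete with a little geometry. The Python sketch below assumes an idealized spherical cell, which is a simplification not made in the text, and computes the surface-area-to-volume ratio for diameters spanning the bacterial and eukaryotic size ranges quoted above. The function name `surface_to_volume` is hypothetical.

```python
import math


def surface_to_volume(diameter_um):
    """Surface-area-to-volume ratio (per µm) of a sphere of the given diameter."""
    radius = diameter_um / 2
    surface_area = 4 * math.pi * radius ** 2      # µm^2
    volume = (4 / 3) * math.pi * radius ** 3      # µm^3
    return surface_area / volume


# Diameters chosen from the size ranges given in the Synopsis.
for diameter in (1, 5, 10, 30):
    ratio = surface_to_volume(diameter)
    print(f"diameter {diameter:>2} µm: surface/volume = {ratio:.2f} per µm")
```

For a sphere the ratio simplifies to 6 divided by the diameter, so a 30 µm cell has one-thirtieth the exchange surface per unit volume of a 1 µm cell.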

| Analytic Questions

1. Consider some question about cell structure or function that you would be interested in answering. Would the data required to answer the question be easier to collect by working on an entire plant or animal or on a population of cultured cells? What might be the advantages and disadvantages of working on a whole organism versus a cell culture?
2. Figure 1.3 shows an intestinal epithelial cell with large numbers of microvilli. What is the advantage to the organism of having these microvilli? What do you expect would happen to an individual that lacked such microvilli as the result of an inherited mutation?
3. The first human cells to be successfully cultured were derived from a malignant tumor. Do you think this simply reflects the availability of cancer cells, or might such cells be better subjects for cell culture? Why?
4. The drawings of plant and animal cells in Figure 1.8b,c include certain structures that are present in plant cells but absent in animal cells. How do you think each of these structures affects the life of the plant?
5. It was noted that cells possess receptors on their surface that allow them to respond to specific stimuli. Many cells in the human body possess receptors that allow them to bind specific hormones that circulate in the blood. Why do you think these hormone receptors are important? What would be the effect on the physiological activities of the body if cells lacked these receptors, or if all cells had the same receptors?
6. If you were to argue that viruses are living organisms, what features of viral structure and function might you use in your argument?
7. If we presume that activities within cells do occur in a manner analogous to that shown in the Rube Goldberg cartoon of Figure 1.7, how would this differ from a human activity, such as building a car on an assembly line or shooting a free throw in a basketball game?
8. Unlike bacterial cells, the nucleus of a eukaryotic cell is bounded by a double-layered membrane studded by complex pores. How do you think this might affect traffic between the DNA and cytoplasm of a eukaryotic cell compared to that of a prokaryotic cell?
9. Examine the photograph of the ciliated protist in Figure 1.16 and consider some of the activities in which this cell engages that a muscle or nerve cell in your body does not.
10. Which type of cell would you expect to achieve the largest volume: a highly flattened cell or a spherical cell? Why?
11. Suppose you were a scientist living in the 1890s and were studying a disease of tobacco crops that stunted the growth of the plants and mottled their leaves. You find that the sap from a diseased plant, when added to a healthy plant, is capable of transmitting the disease to that plant. You examine the sap in the best light microscopes of the period and see no evidence of bacteria. You force the sap through filters whose pores are so small that they retard the passage of the smallest known bacteria, yet the fluid that passes through the filters is still able to transmit the disease. Like Dimitri Ivanovsky, who conducted these experiments more than a hundred years ago, you would probably conclude that the infectious agent was an unknown type of unusually small bacterium. What kinds of experiments might you perform today to test this hypothesis?
12. Most evolutionary biologists believe that all mitochondria have evolved from a single ancestral mitochondrion and all chloroplasts have evolved from a single ancestral chloroplast. In other words, the symbiotic event that gave rise to each of these organelles occurred only once. If this is the case, where on the phylogenetic tree of Figure 3, page 29, would you place the acquisition of each of these organelles?
13. Publication of the complete sequence of the 1918 flu virus and reconstitution of active viral particles was met with great controversy. Those who favored publication of the work argued that this type of information can help to better understand the virulence of influenza viruses and help develop better therapeutics against them. Those opposed to its publication argued that the virus could be reconstituted by bioterrorists or that another pandemic could be created by the accidental release of the virus by a careless investigator. What is your opinion on the merits of conducting this type of work?

2 The Chemical Basis of Life

2.1 Covalent Bonds
2.2 Noncovalent Bonds
2.3 Acids, Bases, and Buffers
2.4 The Nature of Biological Molecules
2.5 Four Types of Biological Molecules
2.6 The Formation of Complex Macromolecular Structures
THE HUMAN PERSPECTIVE: Free Radicals as a Cause of Aging
THE HUMAN PERSPECTIVE: Protein Misfolding Can Have Deadly Consequences
EXPERIMENTAL PATHWAYS: Chaperones: Helping Proteins Reach Their Proper Folded State

We will begin this chapter with a brief examination of the atomic basis of matter, a subject that may seem out of place in a biology textbook. Yet life is based on the properties of atoms and is governed by the same principles of chemistry and physics as all other types of matter. The cellular level of organization is only a small step from the atomic level, as will become evident when we examine the importance of the movement of a few atoms of a molecule during such activities as muscle contraction or the transport of substances across cell membranes. The properties of cells and their organelles derive directly from the activities of the molecules of which they are composed. Consider a process such as cell division, which can be followed in considerable detail under a simple light microscope. To understand the activities that occur when a cell divides, one needs to know, for example, about the interactions between DNA and protein molecules that cause the chromosomes to condense into rod-shaped packages that can be separated into different cells; the molecular construction of protein-containing microtubules that allows them to disassemble at one moment in the cell and reassemble the next moment in an entirely different cellular location; and the properties of lipid molecules that make the outer cell membrane deformable so that it can be pulled into the middle of a cell, thereby pinching the cell in two. It is impossible even to begin to understand cellular function without a reasonable knowledge of the structure and properties of the major types of biological molecules. This is the goal of the present chapter: to provide the necessary information about the chemistry of life to allow the reader to understand the basis of life. We will begin by considering the types of bonds that atoms can form with one another.

A complex between two different macromolecules. A portion of a DNA molecule (shown in blue) is complexed to a protein that consists of two polypeptide subunits, one red and the other yellow. Those parts of the protein that are seen to be inserted into the grooves of the DNA have recognized and bound to a specific sequence of nucleotides in the nucleic acid molecule. (COURTESY OF A. R. FERRÉ-D’AMARÉ AND STEPHEN K. BURLEY.)

2.1 | Covalent Bonds

The atoms that make up a molecule are joined together by covalent bonds in which pairs of electrons are shared between pairs of atoms. The formation of a covalent bond between two atoms is governed by the fundamental principle that an atom is most stable when its outermost electron shell is filled. Consequently, the number of bonds an atom can form depends on the number of electrons needed to fill its outer shell. The electronic structure of a number of atoms is shown in Figure 2.1. The outer (and only) shell of a hydrogen or helium atom is filled when it contains two electrons; the outer shells of the other atoms in Figure 2.1 are filled when they contain eight electrons. Thus, an oxygen atom, with six outer-shell electrons, can fill its outer shell by combining with two hydrogen atoms, forming a molecule of water. The oxygen atom is linked to each hydrogen atom by a single covalent bond (denoted as H:O or H–O). The formation of a covalent bond is accompanied by the release of energy, which must be reabsorbed at some later time if the bond is to be broken. The energy required to cleave C–H, C–C, or C–O covalent bonds is quite large, typically between 80 and 100 kilocalories per mole (kcal/mol)1 of molecules, making these bonds stable under most conditions.

1 One calorie is the amount of thermal energy required to raise the temperature of one gram of water one degree centigrade. One kilocalorie (kcal) equals 1000 calories (or one large Calorie). In addition to calories, energy can also be expressed in Joules, which is a term that was used historically to measure energy in the form of work. One kilocalorie is equivalent to 4186 Joules. Conversely, 1 Joule = 0.239 calories. A mole is equal to Avogadro’s number (6 × 10²³) of molecules. A mole of a substance is its molecular weight expressed in grams.
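The unit relationships in the footnote lend themselves to a short worked calculation. The Python sketch below converts the 80 to 100 kcal/mol range quoted for C–H, C–C, and C–O bonds into Joules per mole and into Joules per individual bond. The conversion factor of 4186 Joules per kilocalorie comes from the footnote; the value 6.022 × 10²³ for Avogadro's number is a more precise figure than the footnote's rounded one and is an assumption added here, as are the function names.

```python
JOULES_PER_KCAL = 4186      # conversion factor given in the footnote
AVOGADRO = 6.022e23         # molecules per mole (assumed; footnote rounds to 6e23)


def kcal_to_joules(kcal):
    """Convert an energy in kilocalories to Joules."""
    return kcal * JOULES_PER_KCAL


def joules_per_bond(kcal_per_mol):
    """Energy needed to break one bond, given the molar bond energy."""
    return kcal_to_joules(kcal_per_mol) / AVOGADRO


for bond_energy in (80, 100):
    per_mole = kcal_to_joules(bond_energy)
    per_bond = joules_per_bond(bond_energy)
    print(f"{bond_energy} kcal/mol = {per_mole:.0f} J/mol = {per_bond:.2e} J per bond")
```

The per-bond figures are tiny in everyday terms, which is why bond energies are normally quoted per mole.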

[Figure 2.1 diagram. First electron shell: H (+1). Second electron shell: Li (+3), C (+6), O (+8), F (+9), Ne (+10). Third electron shell: Na (+11), Si (+14), S (+16), Cl (+17), Ar (+18). Electrons needed for atoms in each column to achieve stability: –1 (Li, Na), +4 (C, Si), +2 (O, S), +1 (F, Cl), 0 (Ne, Ar; inert elements).]

Figure 2.1 A representation of the arrangement of electrons in a number of common atoms. Electrons are present around an atom’s nucleus in “clouds” or orbitals that are roughly defined by their boundaries, which may have a spherical or dumbbell shape. Each orbital contains a maximum of two electrons, which is why the electrons (dark dots in the drawing) are grouped in pairs. The innermost shell contains a single orbital (thus two electrons), the second shell contains four orbitals (thus eight electrons), the third shell also contains four orbitals, and so forth. The number of outer-shell electrons is a primary determinant of the chemical properties of an element. Atoms with a similar number of outer-shell electrons have similar properties. Lithium (Li) and sodium (Na), for example, have one outer-shell electron, and both are highly reactive metals. Carbon (C) and silicon (Si) atoms can each bond with four different atoms. Because of its size, however, a carbon atom can bond to other carbon atoms, forming long-chained organic molecules, whereas silicon is unable to form comparable molecules. Neon (Ne) and argon (Ar) have filled outer shells, making these atoms highly nonreactive; they are referred to as inert gases.
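The shell-filling rule described for Figure 2.1 can be expressed as a small calculation. The Python sketch below uses the simplified model from the caption (capacities of 2, 8, and 8 electrons for the first three shells), so it applies only to atomic numbers 1 through 18. The names `SHELL_CAPACITIES`, `outer_shell`, and `vacancies` are hypothetical, and the function reports how many electrons would fill the outer shell rather than the figure's signed convention for lithium and sodium.

```python
# Simplified shell capacities from the Figure 2.1 caption (valid for Z = 1..18).
SHELL_CAPACITIES = [2, 8, 8]

# Atomic numbers of the atoms drawn in Figure 2.1.
ATOMS = {"H": 1, "Li": 3, "C": 6, "O": 8, "F": 9, "Ne": 10,
         "Na": 11, "Si": 14, "S": 16, "Cl": 17, "Ar": 18}


def outer_shell(atomic_number):
    """Return (electrons in outer shell, capacity of that shell)."""
    if atomic_number < 1:
        raise ValueError("atomic number must be at least 1")
    remaining = atomic_number
    for capacity in SHELL_CAPACITIES:
        if remaining <= capacity:
            return remaining, capacity
        remaining -= capacity      # this shell is full; move outward
    raise ValueError("model only covers atomic numbers 1 to 18")


def vacancies(atomic_number):
    """Electrons still needed to fill the outer shell."""
    electrons, capacity = outer_shell(atomic_number)
    return capacity - electrons


for symbol, z in ATOMS.items():
    electrons, capacity = outer_shell(z)
    print(f"{symbol:>2}: {electrons} of {capacity} outer-shell electrons, "
          f"{vacancies(z)} needed to fill the shell")
```

Oxygen, for example, comes out with six outer-shell electrons and two vacancies, which matches the text's account of oxygen bonding with two hydrogen atoms to form water.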

In many cases, two atoms can become joined by bonds in which more than one pair of electrons is shared. If two electron pairs are shared, as occurs in molecular oxygen (O2), the covalent bond is a double bond, and if three pairs of electrons are shared (as in molecular nitrogen, N2), it is a triple bond. Quadruple bonds are not known to occur. The type of bond between atoms has important consequences in determining the shapes of molecules. For example, atoms joined by a single bond are able to rotate relative to one another, whereas the atoms of double (and triple) bonds lack this ability. As illustrated in Figure 6.6, double bonds can function as energy-capturing centers, driving such vital processes as respiration and photosynthesis. When atoms of the same element bond to one another, as in H2, the electron pairs of the outer shell are equally shared between the two bonded atoms. When two unlike atoms are covalently bonded, however, the positively charged nucleus of one atom exerts a greater attractive force on the outer electrons than the other. Consequently, the shared electrons tend to be located more closely to the atom with the greater attractive force, that is, the more electronegative atom. Among the atoms most commonly present in biological molecules, nitrogen and oxygen are strongly electronegative.
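Unequal sharing of electrons can be illustrated numerically. The Python sketch below uses Pauling electronegativity values, which are standard reference figures but do not appear in the text, to compare pairs of bonded atoms. The cutoffs of 0.5 and 1.8 used to label a bond are a rough rule of thumb assumed here for illustration, not a definition from the text, and the names `PAULING` and `describe_bond` are hypothetical.

```python
# Pauling electronegativity values (standard reference figures, not from the text).
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "Na": 0.93, "Cl": 3.16}


def describe_bond(atom_a, atom_b):
    """Return (electronegativity difference, rough bond label, atom favored by electrons)."""
    difference = abs(PAULING[atom_a] - PAULING[atom_b])
    # Rule-of-thumb cutoffs (assumed): small difference means near-equal sharing.
    if difference < 0.5:
        label = "nonpolar covalent"
    elif difference < 1.8:
        label = "polar covalent"
    else:
        label = "ionic"
    if difference == 0:
        favored = None                      # identical atoms share equally
    else:
        favored = atom_a if PAULING[atom_a] > PAULING[atom_b] else atom_b
    return round(difference, 2), label, favored


for pair in (("H", "H"), ("C", "H"), ("N", "H"), ("O", "H"), ("Na", "Cl")):
    print(pair, describe_bond(*pair))
```

On this scale the shared electrons of an O–H or N–H bond sit closer to the oxygen or nitrogen atom, consistent with the statement that these two elements are strongly electronegative.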

Polar and Nonpolar Molecules

Let’s examine a molecule of water. Water’s single oxygen atom attracts electrons much more forcefully than do either of its hydrogen atoms. As a result, the O–H bonds of a water molecule are said to be polarized, such that one of the atoms has a partial negative charge and the other a partial positive charge. This is generally denoted in the following manner:

[Diagram: a water molecule in which the oxygen atom is labeled δ− (negatively charged end) and each of the two hydrogen atoms is labeled δ+ (positively charged ends).]

Molecules, such as water, that have an asymmetric distribution of charge (or dipole) are referred to as polar molecules. Polar molecules of biological importance contain one or more electronegative atoms, usually O, N, and/or S. Molecules that lack electronegative atoms and strongly polarized bonds, such as molecules that consist entirely of carbon and hydrogen atoms, are said to be nonpolar. The presence of strongly polarized bonds is of utmost importance in determining the reactivity of molecules. Large nonpolar molecules, such as waxes and fats, are relatively inert. Some of the more interesting biological molecules, including proteins and phospholipids, contain both polar and nonpolar regions, which behave very differently.

Ionization

Some atoms are so strongly electronegative that they can capture electrons from other atoms during a chemical reaction. For example, when the elements sodium (a silver-colored metal) and chlorine (a toxic gas) are mixed, the single electron in the outer shell of each sodium atom migrates to the electron-deficient chlorine atom. As a result, these two atoms are transformed into charged ions.

2 Na + Cl:Cl → 2 Na:Cl → 2 Na+ + 2 Cl−

Because the chloride ion has an extra electron (relative to the number of protons in its nucleus), it has a negative charge (Cl−) and is termed an anion. The sodium atom, which has lost an electron, has an extra positive charge (Na+) and is termed a cation. When present in crystals, these two ions form sodium chloride, or table salt. The Na+ and Cl− ions depicted above are relatively stable because they possess filled outer shells. A different arrangement of electrons within an atom can produce a highly reactive species, called a free radical. The structure of free radicals and their importance in biology are considered in the accompanying Human Perspective.

REVIEW

1. Oxygen atoms have eight protons in their nucleus. How many electrons do they have? How many orbitals are in the inner electron shell? How many electrons are in the outer shell? How many more electrons can the outer shell hold before it is filled?
2. Compare and contrast: a sodium atom and a sodium ion; a double bond and a triple bond; an atom of weak and strong electronegativity; the electron distribution around an oxygen atom bound to another oxygen atom and an oxygen atom bound to two hydrogen atoms.

2.2 | Noncovalent Bonds

Covalent bonds are strong bonds between the atoms that make up a molecule. Interactions between molecules (or between different parts of a large biological molecule) are governed by a variety of weaker linkages called noncovalent bonds. Noncovalent bonds do not depend on shared electrons but rather on attractive forces between atoms having an opposite charge. Individual noncovalent bonds are weak (about 1 to 5 kcal/mol) and are thus readily broken and reformed. As will be evident throughout this book, this feature allows noncovalent bonds to mediate the dynamic interactions among molecules in the cell. Even though individual noncovalent bonds are weak, when large numbers of them act in concert, as between the two strands of a DNA molecule or between different parts of a large protein, their attractive forces are additive. Taken as a whole, they provide the structure with considerable stability.
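One way to see why bonds of 1 to 5 kcal/mol are readily broken while covalent bonds of 80 to 100 kcal/mol are not is to compare each with the thermal energy available at body temperature. The Python sketch below is a rough illustration under stated assumptions: it takes the gas constant as 1.987 × 10⁻³ kcal/(mol·K) and body temperature as 310.15 K, and it uses the Boltzmann factor exp(−E/RT) as an approximate guide to how often thermal motion supplies enough energy to break a bond. None of these quantities appears in the text, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3        # gas constant in kcal/(mol*K) (assumed reference value)
BODY_TEMP_K = 310.15     # 37 degrees Celsius in kelvins (assumed)


def thermal_energy(temperature_k=BODY_TEMP_K):
    """Characteristic thermal energy RT in kcal/mol."""
    return R_KCAL * temperature_k


def boltzmann_factor(bond_energy_kcal, temperature_k=BODY_TEMP_K):
    """Relative likelihood that thermal motion supplies the bond energy."""
    return math.exp(-bond_energy_kcal / thermal_energy(temperature_k))


print(f"RT at body temperature: {thermal_energy():.2f} kcal/mol")
for energy in (1, 5, 80, 100):
    print(f"{energy:>3} kcal/mol: {energy / thermal_energy():6.1f} x RT, "
          f"Boltzmann factor {boltzmann_factor(energy):.1e}")
```

A single noncovalent bond is only a few multiples of RT, so it breaks and reforms constantly, whereas many such bonds acting together add up to an energy that thermal motion rarely supplies all at once.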

THE HUMAN PERSPECTIVE

Free Radicals as a Cause of Aging

During the course of this textbook, we will discuss several different biological factors that are thought to contribute to the process of aging. Here we will consider one of these factors: the gradual accumulation of damage to our body’s tissues. The most destructive damage probably occurs to DNA. Alterations in DNA lead to the production of faulty genetic messages that promote gradual cellular deterioration. How does cellular damage occur, and why should it occur more rapidly in a shorter-lived animal, such as a chimpanzee, than a human? The answer may reside at the atomic level. Atoms are stabilized when their shells are filled with electrons. Electron shells consist of orbitals, each of which can hold a maximum of two electrons. Atoms or molecules that have orbitals containing a single unpaired electron tend to be highly unstable; they are called free radicals. Free radicals may be formed when a covalent bond is broken such that each portion keeps one-half of the shared electrons, or they may be formed when an atom or molecule accepts a single electron transferred during an oxidation–reduction reaction. For example, water can be converted into free radicals when exposed to radiation from the sun:

H2O → HO• + H•
(HO• is the hydroxyl radical; “•” indicates a free radical)

Free radicals are extremely reactive and capable of chemically altering many types of molecules, including proteins, nucleic acids, and lipids. This is illustrated by the fact that certain cells of the immune system generate free radicals within their cytoplasm as a means to kill bacteria that these immune cells have ingested. The formation of hydroxyl radicals is probably a major reason that sunlight is so damaging to skin. In 1956, Denham Harman of the University of Nebraska proposed that aging results from tissue damage caused by free radicals. Because the subject of free radicals was not one with which biologists and physicians were familiar, the proposal failed to generate significant interest. Then, in 1969, Joe McCord and Irwin Fridovich of Duke University discovered an enzyme, superoxide dismutase (SOD), whose sole function was the destruction of the superoxide radical (O2•⁻), a type of free radical formed when molecular oxygen picks up an extra electron. SOD catalyzes the following reaction:

O2•⁻ + O2•⁻ + 2H⁺ → H2O2 + O2
(H2O2 is hydrogen peroxide)

Hydrogen peroxide is also a potentially reactive oxidizing agent, which is why it is often used as a disinfectant and bleaching agent. If it is not rapidly destroyed, H2O2 can break down to form hydroxyl radicals that attack the cell’s macromolecules. Hydrogen peroxide is normally destroyed in the cell by the enzymes catalase or glutathione peroxidase. Subsequent research has revealed that superoxide radicals are formed within cells during normal oxidative metabolism and that a superoxide dismutase is present in the cells of diverse organisms, from bacteria to humans. In fact, animals possess three different versions (isoforms) of SOD: a cytosolic, mitochondrial, and extracellular isoform. It is estimated that as much as 1–2 percent of the oxygen taken into human mitochondria can be converted to hydrogen peroxide rather than to water, the normal end product of respiration. The importance of SOD is most clearly revealed in studies of mutant bacteria and yeast that lack the enzyme; these cells are unable to grow in the presence of oxygen. Similarly, mice that are lacking the mitochondrial version of the enzyme (SOD2) are not able to survive more than a week or so after birth. Conversely, mice that have been genetically engineered so that their mitochondria contain elevated levels of the H2O2-destroying enzyme catalase live 20 percent longer, on average, than untreated controls. This finding, reported in 2005, marked the first demonstration that enhanced antioxidant defenses can increase the life span of a mammal. Although the destructive potential of free radicals, such as superoxide and hydroxyl radicals, is unquestioned, the importance of these agents as a factor in aging remains controversial. A related area of research concerns the study of substances called antioxidants that are able to destroy free radicals in the test tube. The sale of these substances provides a major source of revenue for the vitamin/supplements industry. Common antioxidants found in the body include glutathione, vitamins E and C, and beta-carotene (the orange pigment in carrots and other vegetables). Although these substances may prove beneficial in the diet because of their ability to destroy free radicals, studies with rats and mice have failed to provide convincing evidence that they retard the aging process or increase maximum life span.

We will examine several types of noncovalent bonds that are important in cells.

Ionic Bonds: Attractions between Charged Atoms

A crystal of table salt is held together by an electrostatic attraction between positively charged Na⁺ and negatively charged Cl⁻ ions. This type of attraction between fully charged components is called an ionic bond (or a salt bridge). Ionic bonds within a salt crystal may be quite strong. However, if a crystal of salt is dissolved in water, each of the individual ions becomes surrounded by water molecules, which inhibit oppositely charged ions from approaching one another closely enough to form ionic bonds (Figure 2.2). Because cells are composed primarily of water, bonds between free ions are of little importance. In contrast, weak ionic bonds between oppositely charged groups of large biological molecules are of considerable importance. For example, when negatively charged phosphate groups in a DNA molecule are closely associated with positively charged groups on the surface of a protein (Figure 2.3), ionic bonds between them help hold the complex together. Ionic bonds in a cell are generally weak (about 3 kcal/mol) due to the presence of water, but deep within the core of a protein, where water is often excluded, such bonds can be much stronger.

Figure 2.2 The dissolution of a salt crystal. When placed in water, the Na⁺ and Cl⁻ ions of a salt crystal become surrounded by water molecules, breaking the ionic bonds between the two ions. As the salt dissolves, the negatively charged oxygen atoms of the water molecules associate with the positively charged sodium ions, and the positively charged hydrogen atoms of the water molecules associate with the negatively charged chloride ions.


Hydrogen Bonds

When a hydrogen atom is covalently bonded to an electronegative atom, particularly an oxygen or a nitrogen atom, the single pair of shared electrons is greatly displaced toward the nucleus of the electronegative atom, leaving the hydrogen atom with a partial positive charge. As a result, the bare, positively charged nucleus of the hydrogen atom can approach near enough to an unshared pair of outer electrons of a second electronegative atom to form an attractive interaction (Figure 2.4). This weak attractive interaction is called a hydrogen bond. Hydrogen bonds occur between most polar molecules and are particularly important in determining the structure and properties of water (discussed later). Hydrogen bonds also form between polar groups present in large biological molecules, as occurs between the two strands of a DNA molecule (see Figure 2.3). Because their strength is additive, the large number of hydrogen bonds between the strands makes the DNA duplex a stable structure. However, because individual hydrogen bonds are weak (2–5 kcal/mol), the two strands can be partially separated to allow enzymes access to individual strands of the DNA molecule.

Figure 2.3 Noncovalent ionic bonds play an important role in holding the protein molecule on the right (yellow atoms) to the DNA molecule on the left. Ionic bonds form between positively charged nitrogen atoms in the protein and negatively charged oxygen atoms in the DNA. The DNA molecule itself consists of two separate strands held together by noncovalent hydrogen bonds. Although a single noncovalent bond is relatively weak and easily broken, large numbers of these bonds between two molecules, as between two strands of DNA, make the overall complex quite stable. (TOP IMAGE COURTESY OF STEPHEN HARRISON.)

Hydrophobic Interactions and van der Waals Forces

Because of their ability to interact with water, polar molecules, such as sugars and amino acids (described shortly), are said to be hydrophilic, or “water loving.” Nonpolar molecules, such as steroid or fat molecules, are essentially insoluble in water because they lack the charged regions that would attract them to the poles of water molecules. When nonpolar compounds are mixed with water, the nonpolar, hydrophobic (“water fearing”) molecules are forced into aggregates, which minimizes their exposure to the polar surroundings (Figure 2.5).


Figure 2.4 Hydrogen bonds form between a bonded electronegative atom, such as nitrogen or oxygen, which bears a partial negative charge, and a bonded hydrogen atom, which bears a partial positive charge. Hydrogen bonds (about 0.18 nm) are typically about twice as long as the much stronger covalent bonds.

This association of nonpolar molecules is called a hydrophobic interaction. This is why droplets of fat molecules rapidly reappear on the surface of beef or chicken soup even after the liquid is stirred with a spoon. This is also the reason that nonpolar groups tend to localize within the interior of most soluble proteins away from the surrounding water molecules.


Hydrophobic interactions of the type just described are not classified as true bonds because they do not result from an attraction between hydrophobic molecules.2 In addition to this type of interaction, hydrophobic groups can form weak bonds with one another based on electrostatic attractions. Polar molecules associate because they contain permanent asymmetric charge distributions within their structure. Closer examination of the covalent bonds that make up a nonpolar molecule (such as H2 or CH4) reveals that electron distributions are not always symmetric. The distribution of electrons around an atom at any given instant is a statistical matter and, therefore, varies from one instant to the next. Consequently, at any given time, the electron density may happen to be greater on one side of an atom, even though the atom shares the electrons equally with some other atom. These transient asymmetries in electron distribution result in momentary separations of charge (dipoles) within the molecule. If two molecules with transitory dipoles are very close to one another and oriented in the appropriate manner, they experience a weak attractive force, called a van der Waals force, that bonds them together. Moreover, the formation of a temporary separation of charge in one molecule can induce a similar separation in an adjacent molecule. In this way, additional attractive forces can be generated between nonpolar molecules. A single van der Waals force is very weak (0.1 to 0.3 kcal/mol) and very sensitive to the distance that separates the two atoms (Figure 2.6a). As we will see in later chapters, however, biological molecules that interact with one another, for example, an antibody and a protein on the surface of a virus, often possess complementary shapes. As a result, many atoms of both interactants may have the opportunity to approach each other very closely (Figure 2.6b), making van der Waals forces important in biological interactions.
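The distance sensitivity described above (weak attraction up to an optimal separation, strong repulsion at closer range) can be sketched numerically. The text gives no equation, so the Lennard-Jones 12-6 form used here is an assumption chosen only because it has the right qualitative shape. The well depth of 0.2 kcal/mol sits inside the 0.1 to 0.3 kcal/mol range quoted in the text, the optimal distance of 4 Å follows the Figure 2.6 caption, and the function name is hypothetical.

```python
def lennard_jones(r, epsilon=0.2, r_min=4.0):
    """Approximate interaction energy (kcal/mol) of two atoms r angstroms apart.

    Assumed model: Lennard-Jones 12-6 potential (not taken from the text).
    epsilon is the well depth; r_min is the separation of strongest attraction.
    Negative values mean attraction, positive values mean repulsion.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)


for r in (3.0, 3.5, 4.0, 5.0, 6.0, 8.0):
    print(f"{r:4.1f} Å  {lennard_jones(r):+.3f} kcal/mol")
```

In this sketch the energy is most negative at 4 Å, turns sharply positive when the atoms are pushed closer together, and fades toward zero as they move apart, which is why closely fitting, complementary surfaces are needed for van der Waals forces to matter.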

Figure 2.5 In a hydrophobic interaction, the nonpolar (hydrophobic) molecules are forced into aggregates, which minimizes their exposure to the surrounding water molecules.

Figure 2.6 Van der Waals forces. (a) As two atoms approach each other, they experience a weak attractive force that increases up to a specific distance, typically about 4 Å. If the atoms approach more closely, their electron clouds repel one another, causing the atoms to be forced apart. (b) Although individual van der Waals forces are very weak and transient, large numbers of such attractive forces can be formed if two macromolecules have a complementary surface, as is indicated schematically in this figure (see Figure 2.40 for an example).

The Life-Supporting Properties of Water

Life on Earth is totally dependent on water, and water may be essential to the existence of life anywhere in the universe. Even though it contains only three atoms, a molecule of water has a unique structure that gives the molecule extraordinary properties.3 Most importantly,

1. Water is a highly asymmetric molecule with the O atom at one end and the two H atoms at the opposite end.
2. Each of the two covalent bonds in the molecule is highly polarized.
3. All three atoms in a water molecule are adept at forming hydrogen bonds.

The life-supporting attributes of water stem from these properties. Each molecule of water can form hydrogen bonds with as many as four other water molecules, producing a highly interconnected network of molecules (Figure 2.7). Each hydrogen bond is formed when the partially positive-charged hydrogen of one water molecule becomes aligned next to a partially negative-charged oxygen atom of another water molecule. Because of their extensive hydrogen bonding, water molecules have an unusually strong tendency to adhere to one another. This feature is most evident in the thermal properties of water. For example, when water is heated, most of the thermal energy is consumed in disrupting hydrogen bonds rather than contributing to molecular motion (which is measured as an increased temperature). Similarly, evaporation from the liquid to the gaseous state requires that water molecules break the hydrogen bonds holding them to their neighbors, which is why it takes so much energy to convert water to steam. Mammals take advantage of this property when they sweat because the heat required to evaporate the water is absorbed from the body, which thus becomes cooler.

Figure 2.7 Hydrogen bond formation between neighboring water molecules. Each H atom of the molecule has about four-tenths of a full positive charge, and the single O atom has about eight-tenths of a full negative charge.

The small volume of aqueous fluid present within a cell contains a remarkably complex mixture of dissolved substances, or solutes. In fact, water is able to dissolve more types of substances than any other solvent. But water is more than just a solvent; it determines the structure of biological molecules and the types of interactions in which they can engage. Water is the fluid matrix around which the insoluble fabric of the cell is constructed. It is also the medium through which materials move from one compartment of the cell to another; it is a reactant or product in many cellular reactions; and it protects the cell in many ways, from excessive heat, cold, or damaging radiation.

Water is such an important factor in a cell because it is able to form weak interactions with so many different types of chemical groups. Recall from page 35 how water molecules, with their strongly polarized O-H bonds, form a shell around ions, separating the ions from one another. Similarly, water molecules form hydrogen bonds with organic molecules that contain polar groups, such as amino acids and sugars, which ensures their solubility within the cell. Water also plays a key role in maintaining the structure and function of macromolecules and the complexes that they form (such as membranes). Figure 2.8 shows the ordered arrangement of water molecules between two subunits of a protein molecule. The water molecules are hydrogen bonded to each other and to specific amino acids of the protein.

2 This statement reflects an accepted hypothesis that hydrophobic interactions are driven by increased entropy (disorder). When a hydrophobic group projects into an aqueous solvent, the water molecules become ordered in a cage around the hydrophobic group. These solvent molecules become disordered when the hydrophobic group withdraws from the surrounding solvent. A discussion of this and other views can be found in Nature 437:640, 2005 and Curr. Opin. Struct. Biol. 16:152, 2006.

3 One way to appreciate the structure of water is by comparing it to H2S. Like oxygen, sulfur has six outer-shell electrons and forms single bonds with two hydrogen atoms. But because sulfur is a larger atom, it is less electronegative than oxygen, and its ability to form hydrogen bonds is greatly reduced. At room temperature, H2S is a gas, not a liquid. In fact, the temperature has to drop to −86°C before H2S freezes into a solid.

REVIEW

1. Describe some of the properties that distinguish covalent and noncovalent bonds.
2. Why do polar molecules, such as table sugar, dissolve so readily in water? Why do fat droplets form on the surface of an aqueous solution? Why does sweating help cool the body?

Figure 2.8 The importance of water in protein structure. The water molecules (each with a single red oxygen atom and two smaller gray hydrogen atoms) are shown in their ordered locations between the two subunits of a clam hemoglobin molecule. (FROM MARTIN CHAPLIN, NATURE REVS. MOL. CELL BIOL. 7:864, 2006, © 2006, BY MACMILLAN PUBLISHERS LIMITED.)

2.3 | Acids, Bases, and Buffers

Protons are not only found within atomic nuclei; they are also released into the medium whenever a hydrogen atom loses a shared electron. Consider acetic acid, the distinctive ingredient of vinegar, which can undergo the following reaction, described as a dissociation.

CH3COOH ⇌ CH3COO⁻ + H⁺
(acetic acid ⇌ acetate ion + proton, or hydrogen ion)

A molecule that is capable of releasing (donating) a hydrogen ion is termed an acid. The proton released by the acetic acid molecule in the previous reaction does not remain in the free state; instead, it combines with another molecule. Possible reactions involving a proton include

■ Combination with a water molecule to form a hydronium ion (H3O⁺).
  H⁺ + H2O → H3O⁺
■ Combination with a hydroxyl ion (OH⁻) to form a molecule of water.
  H⁺ + OH⁻ → H2O
■ Combination with an amino group (-NH2) in a protein to form a charged amine.
  H⁺ + -NH2 → -NH3⁺

Any molecule that is capable of accepting a proton is defined as a base. Acids and bases exist in pairs, or couples. When the acid loses a proton (as when acetic acid gives up a hydrogen ion), it becomes a base (in this case, acetate ion), which is termed the conjugate base of the acid. Similarly, when a base (such as an -NH2 group) accepts a proton, it forms an acid (in this case -NH3⁺), which is termed the conjugate acid of that base. Thus, the acid always contains one more positive charge than its conjugate base. Water is an example of an amphoteric molecule, that is, one that can serve both as an acid and a base:

H3O⁺ ⇌ H⁺ + H2O ⇌ OH⁻ + H⁺
(H3O⁺ is the acid, H2O is the amphoteric molecule, and OH⁻ is the base)

We will discuss another important group of amphoteric molecules, the amino acids, on page 50.

Acids vary markedly with respect to the ease with which the molecule gives up a proton. The more readily the proton is lost, that is, the less strong the attraction of a conjugate base for its proton, the stronger the acid. Hydrogen chloride is a very strong acid, one that will readily transfer its proton to water molecules. The conjugate base of a strong acid, such as HCl, is a weak base (Table 2.1). Acetic acid, in contrast, is a relatively weak acid because for the most part it remains undissociated when dissolved in water. In a sense, one can consider the degree of dissociation of an acid in terms of the competition for protons among the components of a solution. Water is a better competitor, that is, a stronger base, than chloride ion, so HCl completely dissociates. In contrast, acetate ion is a stronger base than water, so it remains largely as undissociated acetic acid.

Table 2.1 Strengths of Acids and Bases

Acid strength   Acid       Base       Base strength
Very weak       H2O        OH⁻        Strong
Weak            NH4⁺       NH3        Weak
                H2S        S²⁻
                CH3COOH    CH3COO⁻
                H2CO3      HCO3⁻
Strong          H3O⁺       H2O        Very weak
                HCl        Cl⁻
                H2SO4      SO4²⁻

The acidity of a solution is measured by the concentration of hydrogen ions4 and is expressed in terms of pH.

pH = −log [H⁺]

where [H⁺] is the molar concentration of protons. For example, a solution having a pH of 5 contains a hydrogen ion concentration of 10⁻⁵ M. Because the pH scale is logarithmic, an increase of one pH unit corresponds to a tenfold decrease in H⁺ concentration (or a tenfold increase in OH⁻ concentration). Stomach juice (pH 1.8), for example, has roughly 400,000 times the H⁺ concentration of blood (pH 7.4), a difference of 5.6 pH units.

4 In aqueous solutions, protons do not exist in the free state, but rather as H3O⁺ or H5O2⁺. For the sake of simplicity, we will refer to them simply as protons or hydrogen ions.
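The pH definition above lends itself to a short worked calculation. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and concentrations are assumed to be in mol/L.

```python
import math


def ph_from_concentration(h_molar):
    """pH = -log10([H+]), with [H+] given in mol/L."""
    if h_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(h_molar)


def concentration_from_ph(ph):
    """Invert the definition: [H+] = 10^(-pH), in mol/L."""
    return 10.0 ** (-ph)


def fold_difference(ph_a, ph_b):
    """How many times higher [H+] is in solution A than in solution B."""
    return concentration_from_ph(ph_a) / concentration_from_ph(ph_b)


# The two examples used in the text: a 10^-5 M solution, and
# stomach juice (pH 1.8) compared with blood (pH 7.4).
print(ph_from_concentration(1e-5))
print(fold_difference(1.8, 7.4))
```

Because the scale is logarithmic, the 5.6-unit gap between stomach juice and blood corresponds to a factor of 10 raised to the power 5.6, or about 4 × 10⁵.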

When a water molecule dissociates into a hydroxyl ion and a proton, H2O → H⁺ + OH⁻, the equilibrium constant for the reaction can be expressed as:

Keq = [H⁺][OH⁻] / [H2O]

Because the concentration of pure water is always 55.51 M, we can generate a new constant, Kw, the ion-product constant for water,

Kw = [H⁺][OH⁻]

which is equal to 10⁻¹⁴ at 25°C. In pure water, the concentration of both H⁺ and OH⁻ is approximately 10⁻⁷ M. The extremely low level of dissociation of water indicates that it is a very weak acid. In the presence of an acid, the concentration of hydrogen ions rises and the concentration of hydroxyl ions drops (as a result of combination with protons to form water), so that the ion product remains at 10⁻¹⁴.

Most biological processes are acutely sensitive to pH because changes in hydrogen ion concentration affect the ionic state of biological molecules. For example, as the hydrogen ion concentration increases, the -NH2 group of the amino acid arginine becomes protonated to form -NH3⁺, which can disrupt the activity of the entire protein. Even slight changes in pH can impede biological reactions. Organisms, and the cells they comprise, are protected from pH fluctuations by buffers, compounds that react with free hydrogen or hydroxyl ions, thereby resisting changes in pH. Buffer solutions usually contain a weak acid together with its conjugate base. Blood, for example, is buffered by carbonic acid and bicarbonate ions, which normally hold blood pH at about 7.4.

HCO3⁻ + H⁺ ⇌ H2CO3
(bicarbonate ion + hydrogen ion ⇌ carbonic acid)

If the hydrogen ion concentration rises (as occurs during exercise), the bicarbonate ions combine with the excess protons, removing them from solution. Conversely, excess OH⁻ ions (which are generated during hyperventilation) are neutralized by protons derived from carbonic acid. The pH of the fluid within the cell is regulated in a similar manner by a phosphate buffer system consisting of H2PO4⁻ and HPO4²⁻.
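The ion-product relationship for water can also be checked numerically. This is a minimal sketch under stated assumptions: the value of Kw is the 25°C figure from the text, the solution is treated as ideal and dilute, and the constant and function names are hypothetical.

```python
KW_25C = 1e-14  # ion-product constant of water at 25 degrees C, from the text


def hydroxide_from_ph(ph, kw=KW_25C):
    """Return [OH-] in mol/L for a given pH, using Kw = [H+][OH-]."""
    h_molar = 10.0 ** (-ph)
    return kw / h_molar


# Pure water (pH 7) versus an acidic solution (pH 5): as [H+] rises
# a hundredfold, [OH-] falls a hundredfold, so the product is unchanged.
for ph in (7.0, 5.0):
    h = 10.0 ** (-ph)
    oh = hydroxide_from_ph(ph)
    print(f"pH {ph}: [H+] = {h:.1e} M, [OH-] = {oh:.1e} M, product = {h * oh:.1e}")
```

The same arithmetic shows why adding acid lowers the hydroxyl ion concentration: the product of the two concentrations is pinned at 10⁻¹⁴, so one can rise only if the other falls.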

REVIEW

1. If you were to add hydrochloric acid to water, what effect would this have on the hydrogen ion concentration? on the pH? on the ionic charge of any proteins in solution?
2. What is the relationship between a base and its conjugate acid?

2.4 | The Nature of Biological Molecules

The bulk of an organism is water. If the water is evaporated away, most of the remaining dry weight consists of molecules containing atoms of carbon. When first discovered, it was thought that carbon-containing molecules were present only in living organisms and thus were referred to as organic molecules to distinguish them from inorganic molecules found in the inanimate world. As chemists learned to synthesize more and more of these carbon-containing molecules in the lab, the mystique associated with organic compounds disappeared. The compounds produced by living organisms are called biochemicals.

The chemistry of life centers around the chemistry of the carbon atom. The essential quality of carbon that has allowed it to play this role is the incredible number of molecules it can form. Having four outer-shell electrons, a carbon atom can bond with up to four other atoms. Most importantly, each carbon atom is able to bond with other carbon atoms so as to construct molecules with backbones containing long chains of carbon atoms. Carbon-containing backbones may be linear, branched, or cyclic.

[Diagram: carbon backbones drawn as linear, cyclic, and branched arrangements of C atoms]

Cholesterol, whose structure is depicted in Figure 2.9, illustrates various arrangements of carbon atoms. Both the size and electronic structure of carbon make it uniquely suited for generating large numbers of molecules, several hundred thousand of which are known. In contrast, silicon, which is just below carbon in the periodic table and also has four outer-shell electrons (see Figure 2.1), is too large for its positively charged nucleus to attract the outer-shell electrons of neighboring atoms with sufficient force to hold such large molecules together.

Figure 2.9 Cholesterol, whose structure illustrates how carbon atoms (represented by the black balls) are able to form covalent bonds with as many as four other carbon atoms. As a result, carbon atoms can be linked together to form the backbones of a virtually unlimited variety of organic molecules. The carbon backbone of a cholesterol molecule includes four rings, which is characteristic of steroids (e.g., estrogen, testosterone, cortisol). The cholesterol molecule shown here is drawn as a ball-and-stick model, which is another way that molecular structure is depicted.

We can best understand the nature of biological molecules by starting with the simplest group of organic molecules, the hydrocarbons, which contain only carbon and hydrogen atoms. The molecule ethane (C2H6) is a simple hydrocarbon

[Structural formula of ethane: H3C-CH3]

consisting of two atoms of carbon in which each carbon is bonded to the other carbon as well as to three atoms of hydrogen. As more carbons are added, the skeletons of organic molecules increase in length and their structure becomes more complex.

Functional Groups

Hydrocarbons do not occur in significant amounts within most living cells (though they constitute the bulk of the fossil fuels formed from the remains of ancient plants and animals). Many of the organic molecules that are important in biology contain chains of carbon atoms, like hydrocarbons, but certain hydrogen atoms are replaced by various functional groups. Functional groups are particular groupings of atoms that often behave as a unit and give organic molecules their physical properties, chemical reactivity, and solubility in aqueous solution. Some of the more common functional groups are listed in Table 2.2. Two of the most common linkages between functional groups are ester bonds, which form between carboxylic acids and alcohols, and amide bonds, which form between carboxylic acids and amines.

Acid + Alcohol → Ester:   -C(=O)-OH + HO-C-  →  -C(=O)-O-C-
Acid + Amine → Amide:     -C(=O)-OH + HN-C-  →  -C(=O)-N-C-

Most of the groups in Table 2.2 contain one or more electronegative atoms (N, P, O, and/or S) and make organic molecules more polar, more water soluble, and more reactive. Several of these functional groups can ionize and become positively or negatively charged. The effect on molecules by the substitution of various functional groups is readily demonstrated. The hydrocarbon ethane (CH3CH3) depicted above is a toxic, flammable gas. Replace one of the hydrogens with a hydroxyl group (-OH) and the molecule (CH3CH2OH) becomes palatable: it is ethyl alcohol. Substitute a carboxyl group (-COOH) and the molecule becomes acetic acid (CH3COOH), the strong-tasting ingredient in vinegar. Substitute a sulfhydryl group (-SH), and you have formed CH3CH2SH, a strong, foul-smelling agent, ethyl mercaptan, used by biochemists in studying enzyme reactions.

Table 2.2 Functional Groups

Group        Structure
Methyl       -CH3
Hydroxyl     -OH
Carboxyl     -C(=O)-OH
Amino        -NH2
Phosphate    -O-P(=O)(OH)2
Carbonyl     >C=O
Sulfhydryl   -SH

A Classification of Biological Molecules by Function

The organic molecules commonly found within living cells can be divided into several categories based on their role in metabolism.

1. Macromolecules. The molecules that form the structure and carry out the activities of cells are huge, highly organized molecules called macromolecules, which contain anywhere from dozens to millions of carbon atoms. Because of their size and the intricate shapes that macromolecules can assume, some of these molecular giants can perform complex tasks with great precision and efficiency. The presence of macromolecules, more than any other characteristic, endows organisms with the properties of life and sets them apart chemically from the inanimate world. Macromolecules can be divided into four major categories: proteins, nucleic acids, polysaccharides, and certain lipids. The first three types are polymers composed of a large number of low-molecular-weight building blocks, or monomers. These macromolecules are constructed from monomers by a process of polymerization that resembles coupling railroad cars onto a train (Figure 2.10). The basic structure and function of each type of macromolecule are similar in all organisms. It is not until you look at the specific sequences of monomers that make up individual macromolecules that the diversity among organisms becomes apparent.
2. The building blocks of macromolecules. Most of the macromolecules within a cell have a short lifetime compared with the cell itself; with the exception of the cell’s DNA, they are continually broken down and replaced by new macromolecules. Consequently, most cells contain a supply (or pool) of low-molecular-weight precursors that are ready to be incorporated into macromolecules. These include sugars, which are the precursors of polysaccharides; amino acids, which are the precursors of proteins; nucleotides, which are the precursors of nucleic acids; and fatty acids, which are incorporated into lipids.
3. Metabolic intermediates (metabolites). The molecules in a cell have complex chemical structures and must be synthesized in a step-by-step sequence beginning with specific starting materials. In the cell, each series of chemical reactions is termed a metabolic pathway. The cell starts with compound A and converts it to compound B, then to compound C, and so on, until some functional end product (such as an amino acid building block of a protein) is produced. The compounds formed along the pathways leading to the end products might have no function per se and are called metabolic intermediates.
4. Molecules of miscellaneous function. This is obviously a broad category of molecules but not as large as you might expect; the vast bulk of the dry weight of a cell is made up of macromolecules and their direct precursors. The molecules of miscellaneous function include such substances as vitamins, which function primarily as adjuncts to proteins; certain steroid or amino acid hormones; molecules involved in energy storage, such as ATP; regulatory molecules such as cyclic AMP; and metabolic waste products such as urea.

Figure 2.10 Monomers and polymers; polymerization and hydrolysis. (a) Polysaccharides, proteins, and nucleic acids consist of monomers (subunits) linked together by covalent bonds. Free monomers do not simply react with each other to become macromolecules. Rather, each monomer is first activated by attachment to a carrier molecule that subsequently transfers the monomer to the end of the growing macromolecule. (b) A macromolecule is disassembled by hydrolysis of the bonds that join the monomers together. Hydrolysis is the splitting of a bond by water. All of these reactions are catalyzed by specific enzymes.

REVIEW

1. What properties of a carbon atom are critical to life?
2. Draw the structures of four different functional groups. How would each of these groups alter the solubility of a molecule in water?

2.5 | Four Types of Biological Molecules The macromolecules just described can be divided into four types of organic molecules: carbohydrates, lipids, proteins, and nucleic acids. The localization of these molecules in a number of cellular structures is shown in an overview in Figure 2.11.


Carbohydrates


Carbohydrates (or glycans as they are often called) include simple sugars (or monosaccharides) and all larger molecules constructed of sugar building blocks. Carbohydrates function primarily as stores of chemical energy and as durable building materials for biological construction. Most sugars have the general formula (CH2O)n. The sugars of importance in cellular metabolism have values of n that range from 3 to 7. Sugars containing three carbons are known as trioses, those with four carbons as tetroses, those with five carbons as pentoses, those with six carbons as hexoses, and those with seven carbons as heptoses.
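The general formula (CH2O)n and the naming scheme above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the atomic masses are standard rounded values that are not given in the text.

```python
# Illustrative sketch: expand (CH2O)n for the sugars named in the text.
# Atomic masses (g/mol) are standard rounded values, an assumption not taken from the text.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Class names for n = 3 to 7, as listed in the text.
SUGAR_CLASS = {3: "triose", 4: "tetrose", 5: "pentose", 6: "hexose", 7: "heptose"}


def sugar_class(n_carbons: int) -> str:
    """Return the class name for a sugar with n_carbons carbons (3 to 7 only)."""
    if n_carbons not in SUGAR_CLASS:
        raise ValueError("the text names sugar classes only for n = 3 to 7")
    return SUGAR_CLASS[n_carbons]


def sugar_formula(n_carbons: int) -> str:
    """Expand (CH2O)n into a molecular formula such as C6H12O6."""
    return f"C{n_carbons}H{2 * n_carbons}O{n_carbons}"


def molar_mass(n_carbons: int) -> float:
    """Molar mass in g/mol of a sugar that obeys (CH2O)n."""
    unit = ATOMIC_MASS["C"] + 2 * ATOMIC_MASS["H"] + ATOMIC_MASS["O"]
    return n_carbons * unit


if __name__ == "__main__":
    for n in range(3, 8):
        print(f"{sugar_class(n):8s} {sugar_formula(n):10s} {molar_mass(n):8.3f} g/mol")
```

For n = 6 the sketch gives C6H12O6, the formula shared by hexoses such as glucose and fructose.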


The Structure of Simple Sugars Each sugar molecule consists of a backbone of carbon atoms linked together in a linear array by single bonds. Each of the carbon atoms of the backbone is linked to a single hydroxyl group, except for one that bears a carbonyl (C=O) group. If the carbonyl group is located at an internal position (to form a ketone group), the sugar is a ketose, such as fructose, which is shown in Figure 2.12a. If the carbonyl is located at one end of the sugar, it forms an aldehyde group and the molecule is known as an aldose, as exemplified by glucose, which is shown in Figure 2.12b–f. Because of their large numbers of hydroxyl groups, sugars tend to be highly water soluble. Although the straight-chained formulas shown in Figure 2.12a,b are useful for comparing the structures of various sugars, they do not reflect the fact that sugars with five or more carbon atoms undergo a self-reaction (Figure 2.12c) that converts them into a closed, or ring-containing, molecule. The ring forms of sugars are usually depicted as flat (planar) structures (Figure 2.12d) lying perpendicular to the plane of the paper with the thickened line situated closest to the reader. The H and OH groups lie parallel to the plane of the paper,

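Figure 2.11 An overview of the types of biological molecules that make up various cellular structures. [Structures labeled in the figure: chromatin in nucleus, cell wall, starch grain in chloroplast, plasma membrane, ribosome, microtubules, and mitochondrion; each is annotated with its constituent molecule types (carbohydrate, lipid, protein, DNA, RNA).]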

Figure 2.12 The structures of sugars. (a) Straight-chain formula of D-fructose, a ketohexose [keto, indicating the carbonyl (yellow), is located internally, and hexose because it consists of six carbons]. (b) Straight-chain formula of D-glucose, an aldohexose (aldo because the carbonyl is located at the end of the molecule). (c) Self-reaction in which glucose is converted from an open chain to a closed ring (a pyranose ring). (d) Glucose is commonly depicted in the form of a flat (planar) ring (a Haworth projection) lying perpendicular to the page with the thickened line situated closest to the reader and the H and OH groups projecting either above or below the ring. The basis for the designation α-D-glucose is discussed in the following section. (e) The chair conformation of glucose, which depicts its three-dimensional structure more accurately than the flattened ring of part d. (f) A ball-and-stick model of the chair conformation of glucose.


projecting either above or below the ring of the sugar. In actual fact, the sugar ring is not a planar structure, but usually exists in a three-dimensional conformation resembling a chair (Figure 2.12e,f).

Stereoisomerism As noted earlier, a carbon atom can bond with four other atoms. The arrangement of the groups around a carbon atom can be depicted as in Figure 2.13a with the carbon placed in the center of a tetrahedron and the bonded groups projecting into its four corners. Figure 2.13b depicts a molecule of glyceraldehyde, which is the only aldotriose. The second carbon atom of glyceraldehyde is linked to four different groups (—H, —OH, —CHO, and —CH2OH). If the four groups bonded to a carbon atom are all different, as in glyceraldehyde, then two possible configurations exist that cannot be superimposed on one another. These two molecules (termed stereoisomers or enantiomers) have essentially the same chemical reactivities, but their structures are mirror images (not unlike a pair of right and left human hands). By convention, the molecule is called D-glyceraldehyde if the hydroxyl group of carbon 2 projects to the right, and L-glyceraldehyde if it projects to the left (Figure 2.13c). Because it acts as a site of stereoisomerism, carbon 2 is referred to as an asymmetric carbon atom.

As the backbone of sugar molecules increases in length, so too does the number of asymmetric carbon atoms and, consequently, the number of stereoisomers. Aldotetroses have two asymmetric carbons and thus can exist in four different configurations (Figure 2.14). Similarly, there are 8 different aldopentoses (three asymmetric carbons) and 16 different aldohexoses (four asymmetric carbons). The designation of each of these sugars as D or L is based by convention on the arrangement of groups attached to the asymmetric carbon atom farthest from the aldehyde (the carbon associated with the aldehyde is designated C1). If the hydroxyl group of this carbon projects to the right, the aldose is a D-sugar; if it projects to the left, it is an L-sugar. The enzymes present in living cells can distinguish between the D and L forms of a sugar. Typically, only one of the stereoisomers (such as D-glucose and L-fucose) is used by cells.

The self-reaction in which a straight-chain glucose molecule is converted into a six-membered (pyranose) ring was shown in Figure 2.12c. Unlike its precursor in the open chain, the C1 of the ring bears four different groups and thus becomes a new center of asymmetry within the sugar molecule. Because of this extra asymmetric carbon atom, each type of pyranose exists as α and β stereoisomers (Figure 2.15). By convention, the molecule is an α-pyranose when the OH group of the first carbon projects below the plane of the ring, and a β-pyranose when the hydroxyl projects upward. The difference between the two forms has important biological consequences, resulting, for example, in the compact shape of glycogen and starch molecules and the extended conformation of cellulose (discussed later).

Figure 2.13 Stereoisomerism of glyceraldehyde. (a) The four groups bonded to a carbon atom (labeled a, b, c, and d) occupy the four corners of a tetrahedron with the carbon atom at its center. (b) Glyceraldehyde is the only three-carbon aldose; its second carbon atom is bonded to four different groups (—H, —OH, —CHO, and —CH2OH). As a result, glyceraldehyde can exist in two possible configurations that are not superimposable, but instead are mirror images of each other as indicated. These two stereoisomers (or enantiomers) can be distinguished by the configuration of the four groups around the asymmetric (or chiral) carbon atom. Solutions of these two isomers rotate plane-polarized light in opposite directions and, thus, are said to be optically active. (c) Straight-chain formulas of glyceraldehyde. By convention, the D-isomer is shown with the OH group on the right.
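The stereoisomer counts quoted in the text (2 for glyceraldehyde, 4 aldotetroses, 8 aldopentoses, 16 aldohexoses) follow from one rule: each asymmetric carbon doubles the number of configurations. The sketch below reproduces those counts and applies the D/L convention. It rests on two labeled assumptions: that in an open-chain aldose every carbon except C1 and the terminal CH2OH carbon is asymmetric, and that a straight-chain formula can be encoded as a string of "R" and "L" giving the side on which each hydroxyl projects, listed from C2 downward. The function names and the encoding are hypothetical, chosen only for illustration.

```python
# Illustrative sketch of stereoisomer counting and the D/L convention for aldoses.
# Assumption: in an open-chain aldose, all carbons except C1 (aldehyde) and the
# last carbon (CH2OH) are asymmetric.


def asymmetric_carbons(n_carbons: int) -> int:
    """Number of asymmetric carbons in an open-chain aldose."""
    if n_carbons < 3:
        raise ValueError("the smallest aldose (glyceraldehyde) has three carbons")
    return n_carbons - 2


def stereoisomer_count(n_carbons: int) -> int:
    """Each asymmetric carbon can adopt two configurations."""
    return 2 ** asymmetric_carbons(n_carbons)


def d_or_l(hydroxyl_sides: str) -> str:
    """Assign D or L from a straight-chain formula.

    hydroxyl_sides holds one letter per asymmetric carbon, listed from C2
    downward: 'R' if the OH projects to the right, 'L' if to the left.
    Only the asymmetric carbon farthest from the aldehyde decides the label.
    """
    if not hydroxyl_sides or set(hydroxyl_sides) - {"R", "L"}:
        raise ValueError("expected a non-empty string of 'R' and 'L'")
    return "D" if hydroxyl_sides[-1] == "R" else "L"


if __name__ == "__main__":
    for n, name in [(3, "aldotriose"), (4, "aldotetrose"),
                    (5, "aldopentose"), (6, "aldohexose")]:
        print(f"{name}: {stereoisomer_count(n)} stereoisomers")
    # Glyceraldehyde with the C2 hydroxyl on the right is the D form.
    print(d_or_l("R"))
    # An aldotetrose with OH left at C2 and right at C3 (threose) is a D-sugar.
    print(d_or_l("LR"))
```

Note that the label depends only on the last letter of the string, which is why two aldotetroses that differ at C2 can both be D-sugars.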

Linking Sugars Together Sugars can be joined to one another by covalent glycosidic bonds to form larger molecules. Glycosidic bonds form by reaction between carbon atom C1 of one sugar and the hydroxyl group of another sugar, generating a —C—O—C— linkage between the two sugars. As discussed below (and indicated in Figures 2.16 and 2.17), sugars can be joined by quite a variety of different glycosidic

        D-Erythrose   D-Threose   L-Threose   L-Erythrose
C1      CHO           CHO         CHO         CHO
C2      HCOH          HOCH        HCOH        HOCH
C3      HCOH          HCOH        HOCH        HOCH
C4      CH2OH         CH2OH       CH2OH       CH2OH

Figure 2.14 Aldotetroses. Because they have two asymmetric carbon atoms, aldotetroses can exist in four configurations.

Figure 2.15 Formation of an α- and β-pyranose. When a molecule of glucose undergoes self-reaction to form a pyranose ring (i.e., a six-membered ring), two stereoisomers are generated (β-D-glucopyranose and α-D-glucopyranose). The two isomers are in equilibrium with each other through the open-chain form of the molecule. By convention, the molecule is an α-pyranose when the OH group of the first carbon projects below the plane of the ring, and a β-pyranose when the hydroxyl group projects upward.

bonds. Molecules composed of only two sugar units are disaccharides (Figure 2.16). Disaccharides serve primarily as readily available energy stores. Sucrose, or table sugar, is a major component of plant sap, which carries chemical energy from one part of the plant to another. Lactose, present in the milk of most mammals, supplies newborn mammals with fuel for early growth and development. Lactose in the diet is hydrolyzed by the enzyme lactase, which is present in the plasma membranes of the cells that line the intestine. Many people lose this enzyme after childhood and find that eating dairy products causes digestive discomfort.

Sugars may also be linked together to form small chains called oligosaccharides (oligo = few). Most often such chains are found covalently attached to lipids and proteins, converting them into glycolipids and glycoproteins, respectively. Oligosaccharides are particularly important on the glycolipids and glycoproteins of the plasma membrane, where they project from the cell surface (see Figure 4.4c). Because oligosaccharides may be composed of many different combinations of sugar units, these carbohydrates can play an informational role; that is, they can serve to distinguish one type of cell from another and help mediate specific interactions of a cell with its surroundings.

Figure 2.16 Disaccharides. Sucrose (a) and lactose (b) are two of the most common disaccharides. Sucrose is composed of glucose and fructose joined by an α(1 → 2) linkage, whereas lactose is composed of glucose and galactose joined by a β(1 → 4) linkage.

Polysaccharides By the middle of the nineteenth century, it was known that the blood of people suffering from diabetes had a sweet taste due to an elevated level of glucose, the key sugar in energy metabolism. Claude Bernard, a prominent French physiologist of the period, was looking for the cause of diabetes by investigating the source of blood sugar. It was assumed at the time that any sugar present in a human or an animal had to have been previously consumed in the diet. Working with dogs, Bernard found that, even if the animals were placed on a diet totally lacking carbohydrates, their blood still contained a normal amount of glucose. Clearly, glucose could be formed in the body from other types of compounds. After further investigation, Bernard found that glucose enters the blood from the liver. Liver tissue, he found, contains an insoluble polymer of glucose he named glycogen. Bernard concluded that various food materials (such as proteins) were carried to the liver where they were chemically converted to glucose and stored as glycogen. Then, as the body needed sugar for fuel, the glycogen in the liver was transformed to glucose, which was released into the bloodstream to satisfy glucose-depleted tissues. In Bernard’s hypothesis, the balance between glycogen formation and glycogen breakdown in the liver was the prime determinant in maintaining the relatively constant (homeostatic) level of glucose in the blood. Bernard’s hypothesis proved to be correct. The molecule he named glycogen is a type of polysaccharide, a polymer of sugar units joined by glycosidic bonds.

Glycogen and Starch: Nutritional Polysaccharides Glycogen is a branched polymer containing only one type of monomer: glucose (Figure 2.17a). Most of the sugar units of a glycogen molecule are joined to one another by α(1 → 4) glycosidic bonds (type 2 bond in Figure 2.17a). Branch points contain a sugar joined to three neighboring units rather than to two, as in the unbranched segments of the polymer. The extra neighbor, which forms the branch, is linked by an α(1 → 6) glycosidic bond (type 1 bond in Figure 2.17a).

Figure 2.17 Three polysaccharides with identical sugar monomers but dramatically different properties. Glycogen (a), starch (b), and cellulose (c) are each composed entirely of glucose subunits, yet their chemical and physical properties are very different due to the distinct ways that the monomers are linked together (three different types of linkages are indicated by the circled numbers). Glycogen molecules are the most highly branched, starch molecules assume a helical arrangement, and cellulose molecules are unbranched and highly extended. Whereas glycogen and starch are energy stores, cellulose molecules are bundled together into tough fibers that are suited for their structural role. Colorized electron micrographs show glycogen granules in a liver cell, starch grains (amyloplasts) in a plant seed, and cellulose fibers in a plant cell wall; each is indicated by an arrow. [PHOTO INSETS: (TOP) DON FAWCETT/PHOTO RESEARCHERS, INC.; (CENTER) JEREMY BURGESS/PHOTO RESEARCHERS, INC.; (BOTTOM) BIOPHOTO ASSOCIATES/PHOTO RESEARCHERS, INC.]

Glycogen serves as a storehouse of surplus chemical energy in most animals. Human skeletal muscles, for example, typically contain enough glycogen to fuel about 30 minutes of moderate activity. Depending on various factors, glycogen typically ranges in molecular weight from about one to four million daltons. When stored in cells, glycogen is highly concentrated in what appears as dark-staining, irregular granules in electron micrographs (Figure 2.17a, right).

Most plants bank their surplus chemical energy in the form of starch, which like glycogen is also a polymer of glucose. Potatoes and cereals, for example, consist primarily of starch. Starch is actually a mixture of two different polymers, amylose and amylopectin. Amylose is an unbranched, helical molecule whose sugars are joined by α(1 → 4) linkages (Figure 2.17b), whereas amylopectin is branched. Amylopectin differs from glycogen in being much less branched and having an irregular branching pattern. Starch is stored as densely packed granules, or starch grains, which are enclosed in membrane-bound organelles (plastids) within the plant cell (Figure 2.17b, right). Although animals don’t synthesize starch, they possess an enzyme (amylase) that readily hydrolyzes it.

Cellulose, Chitin, and Glycosaminoglycans: Structural Polysaccharides Whereas some polysaccharides constitute easily digested energy stores, others form tough, durable structural materials. Cotton and linen, for example, consist largely of cellulose, which is the major component of plant cell walls. Cotton textiles owe their durability to the long, unbranched cellulose molecules, which are ordered into side-by-side aggregates to form molecular cables (Figure 2.17c, right panel) that are ideally constructed to resist pulling (tensile) forces. Like glycogen and starch, cellulose consists solely of glucose monomers; its properties differ dramatically from these other polysaccharides because the glucose units are joined by β(1 → 4) linkages (bond 3 in Figure 2.17c) rather than α(1 → 4) linkages. Ironically, multicellular animals (with rare exception) lack the enzyme needed to degrade cellulose, which happens to be the most abundant organic material on Earth and rich in chemical energy. Animals that “make a living” by digesting cellulose, such as termites and sheep, do so by harboring bacteria and protozoa that synthesize the necessary enzyme, cellulase.

Not all biological polysaccharides consist of glucose monomers. Chitin is an unbranched polymer of the sugar N-acetylglucosamine, which is similar in structure to glucose but has an acetyl amino group instead of a hydroxyl group bonded to the second carbon atom of the ring.

[Structural formula: N-acetylglucosamine, a glucose ring bearing an HNCOCH3 group on the second carbon]

Chitin occurs widely as a structural material among invertebrates, particularly in the outer covering of insects, spiders, and crustaceans. Chitin is a tough, resilient, yet flexible material not unlike certain plastics. Insects owe much of their success to this highly adaptive polysaccharide (Figure 2.18).

Another group of polysaccharides that has a more complex structure is the glycosaminoglycans (or GAGs). Unlike other polysaccharides, they have the structure —A—B—A—B—, where A and B represent two different sugars. The best-studied GAG is heparin, which is secreted by cells in the lungs and other tissues in response to tissue injury. Heparin inhibits blood coagulation, thereby preventing the formation of clots that can block the flow of blood to the heart or lungs. Heparin accomplishes this feat by activating an inhibitor (antithrombin) of a key enzyme (thrombin) that is required for blood coagulation. Heparin, which is normally extracted from pig tissue, has been used for decades to prevent blood clots in patients following major surgery. Unlike heparin, most GAGs are found in the spaces that surround cells, and their structure and function are discussed in Section 7.1. The most complex polysaccharides are found in plant cell walls (Section 7.6).

Figure 2.18 Chitin is the primary component of the outer skeleton of this grasshopper. (ANTHONY BANNISTER/GALLO IMAGES/ © CORBIS)

Lipids

Lipids are a diverse group of nonpolar biological molecules whose common properties are their ability to dissolve in organic solvents, such as chloroform or benzene, and their inability to dissolve in water, a property that explains many of their varied biological functions. Lipids of importance in cellular function include fats, steroids, and phospholipids.

Fats Fats consist of a glycerol molecule linked by ester bonds to three fatty acids; the composite molecule is termed a triacylglycerol (Figure 2.19a). We will begin by considering the structure of fatty acids. Fatty acids are long, unbranched hydrocarbon chains with a single carboxyl group at one end (Figure 2.19b). Because the two ends of a fatty acid molecule have a very different structure, they also have different properties. The hydrocarbon chain is hydrophobic, whereas the carboxyl group (—COOH), which bears a negative charge at physiological pH, is hydrophilic. Molecules having both hydrophobic and hydrophilic regions are said to be amphipathic; such molecules have unusual and biologically important properties. The properties of fatty acids can be appreciated by considering the use of a familiar product: soap, which consists of fatty acids. In past centuries, soaps were made by heating animal fat in strong alkali (NaOH or KOH) to break the bonds between the fatty acids and the glycerol.

Figure 2.20 Soaps consist of fatty acids. In this schematic drawing of a soap micelle, the nonpolar tails of the fatty acids are directed inward, where they interact with the greasy matter to be dissolved. The negatively charged heads are located at the surface of the micelle, where they interact with the surrounding water. Membrane proteins, which also tend to be insoluble in water, can also be solubilized in this way by extraction of membranes with detergents.

Figure 2.19 Fats and fatty acids. (a) The basic structure of a triacylglycerol (also called a triglyceride or a neutral fat). The glycerol moiety, indicated in orange, is linked by three ester bonds to the carboxyl groups of three fatty acids whose tails are indicated in green. (b) Stearic acid, an 18-carbon saturated fatty acid that is common in animal fats. (c) Space-filling model of tristearate, a triacylglycerol containing three identical stearic acid chains. (d) Space-filling model of linseed oil, a triacylglycerol derived from flax seeds that contains three unsaturated fatty acids (linoleic, oleic, and linolenic acids). The sites of unsaturation, which produce kinks in the molecule, are indicated by the yellow-orange bars.
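The difference between a saturated and an unsaturated fatty acid shows up directly in the molecular formula. The sketch below assumes the standard general formula for an unbranched fatty acid with n carbons and d carbon-carbon double bonds, CnH(2n − 2d)O2, which is not stated in the text. The double-bond counts used for oleic (1), linoleic (2), and linolenic (3) acids are likewise standard values supplied here as assumptions, and the function name is hypothetical.

```python
# Illustrative sketch: molecular formulas of unbranched fatty acids.
# Assumption (not from the text): a fatty acid with n carbons and d C=C double
# bonds has the formula C(n) H(2n - 2d) O(2); each double bond removes two H.


def fatty_acid_formula(n_carbons: int, double_bonds: int = 0) -> str:
    """Return the molecular formula of an unbranched fatty acid."""
    if n_carbons < 2:
        raise ValueError("a fatty acid needs a carboxyl carbon plus a chain")
    # Each double bond uses two chain carbons; the carboxyl carbon cannot take part.
    if double_bonds < 0 or 2 * double_bonds > n_carbons - 1:
        raise ValueError("impossible number of double bonds for this chain length")
    hydrogens = 2 * n_carbons - 2 * double_bonds
    return f"C{n_carbons}H{hydrogens}O2"


if __name__ == "__main__":
    # Stearic acid is the 18-carbon saturated fatty acid of Figure 2.19b.
    # The other three are the 18-carbon unsaturated acids named for linseed oil;
    # their double-bond counts are assumed standard values.
    examples = {"stearic": 0, "oleic": 1, "linoleic": 2, "linolenic": 3}
    for name, d in examples.items():
        print(f"{name:10s} {fatty_acid_formula(18, d)}")
```

Each added double bond removes two hydrogen atoms, which is the sense in which a chain without double bonds is "saturated" with hydrogen.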

Today, most soaps are made synthetically. Soaps owe their grease-dissolving capability to the fact that the hydrophobic end of each fatty acid can embed itself in the grease, whereas the hydrophilic end can interact with the surrounding water.

As a result, greasy materials are converted into complexes (micelles) that can be dispersed by water (Figure 2.20).

Fatty acids differ from one another in the length of their hydrocarbon chain and the presence or absence of double bonds. Fatty acids present in cells typically vary in length from 14 to 20 carbons. Fatty acids that lack double bonds, such as stearic acid (Figure 2.19b), are described as saturated; those possessing double bonds are unsaturated. Naturally occurring fatty acids have double bonds in the cis configuration. Double bonds of the cis configuration (in which the two hydrogen atoms lie on the same side of the C=C bond), as opposed to the trans configuration (in which they lie on opposite sides), produce kinks in a fatty acid chain. Consequently, the more double bonds that fatty acid chains possess, the less effectively these long chains can be packed together. This lowers the temperature at which a fatty acid-containing lipid melts. Tristearate, whose fatty acids lack double bonds (Figure 2.19c), is a common component of animal fats and remains in a solid state well above room temperature. In contrast, the profusion of double bonds in vegetable fats accounts for their liquid state, both in the plant cell and on the grocery shelf, and for their being labeled as “polyunsaturated.” Fats that are liquid at room temperature are described as oils. Figure 2.19d shows the structure of linseed oil, a highly volatile lipid extracted from flax seeds, that remains a liquid at a much lower temperature than does tristearate. Solid shortenings, such as margarine, are formed from unsaturated vegetable oils by chemically reducing the double bonds with hydrogen atoms (a process termed hydrogenation). The hydrogenation process also converts some of the cis double bonds into trans double bonds, which are straight rather than kinked. This process generates partially hydrogenated or trans-fats.

A molecule of fat can contain three identical fatty acids (as in Figure 2.19c), or it can be a mixed fat, containing more than one fatty acid species (as in Figure 2.19d). Most natural fats, such as olive oil or butterfat, are mixtures of molecules having different fatty acid species. Fats are very rich in chemical energy; a gram of fat contains over twice the energy content of a gram of carbohydrate (for reasons discussed in Section 3.1). Carbohydrates function primarily as a short-term, rapidly available energy source, whereas fat reserves store energy on a long-term basis. It is estimated that a person of average size contains about 0.5 kilograms (kg) of carbohydrate, primarily in the form of glycogen. This amount of carbohydrate provides approximately 2000 kcal of total energy. During the course of a strenuous day’s exercise, a person can virtually deplete his or her body’s entire store of carbohydrate. In contrast, the average person contains approximately 16 kg of fat (equivalent to 144,000 kcal of energy), and as we all know, it can take a very long time to deplete our store of this material.
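The figures quoted above are enough to recover the energy content per gram of each fuel and to check the statement that fat holds "over twice" the energy of carbohydrate. The short sketch below does that arithmetic; the constant and function names are hypothetical, and the only inputs are the four numbers given in the text.

```python
# Illustrative arithmetic using only the figures quoted in the text.
CARB_MASS_KG = 0.5          # carbohydrate in a person of average size
CARB_ENERGY_KCAL = 2000     # energy supplied by that carbohydrate
FAT_MASS_KG = 16            # fat in the average person
FAT_ENERGY_KCAL = 144_000   # energy equivalent of that fat


def energy_density_kcal_per_g(energy_kcal: float, mass_kg: float) -> float:
    """Energy content per gram, given total energy and mass in kilograms."""
    return energy_kcal / (mass_kg * 1000)


carb_density = energy_density_kcal_per_g(CARB_ENERGY_KCAL, CARB_MASS_KG)
fat_density = energy_density_kcal_per_g(FAT_ENERGY_KCAL, FAT_MASS_KG)
density_ratio = fat_density / carb_density
reserve_ratio = FAT_ENERGY_KCAL / CARB_ENERGY_KCAL

if __name__ == "__main__":
    print(f"carbohydrate: {carb_density} kcal/g")
    print(f"fat:          {fat_density} kcal/g")
    print(f"fat holds {density_ratio} times the energy per gram")
    print(f"the fat reserve holds {reserve_ratio} times the carbohydrate reserve")
```

The quoted totals correspond to 4 kcal per gram of carbohydrate and 9 kcal per gram of fat, a ratio of 2.25, and the fat reserve as a whole holds 72 times the energy of the carbohydrate reserve.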


Because they lack polar groups, fats are extremely insoluble in water and are stored in cells in the form of dry lipid droplets. Since lipid droplets do not contain water as do glycogen granules, they represent an extremely concentrated storage fuel. In many animals, fats are stored in special cells (adipocytes) whose cytoplasm is filled with one or a few large lipid droplets. Adipocytes exhibit a remarkable ability to change their volume to accommodate varying quantities of fat.

Steroids Steroids are built around a characteristic four-ringed hydrocarbon skeleton. One of the most important steroids is cholesterol, a component of animal cell membranes and a precursor for the synthesis of a number of steroid hormones, such as testosterone, progesterone, and estrogen (Figure 2.21). Cholesterol is largely absent from plant cells, which is why vegetable oils are considered “cholesterol-free,” but plant cells may contain large quantities of related compounds.

Phospholipids The chemical structure of a common phospholipid is shown in Figure 2.22. The molecule resembles a fat (triacylglycerol), but has only two fatty acid chains rather than three; it is a diacylglycerol. The third hydroxyl of the glycerol backbone is covalently bonded to a phosphate group, which in turn is covalently bonded to a small polar group, such as choline, as shown in Figure 2.22. Thus, unlike fat molecules, phospholipids contain two ends that have very different properties: the end containing the phosphate group has a distinctly hydrophilic character; the other end composed of the two fatty acid tails has a distinctly hydrophobic character. Because phospholipids function primarily in cell membranes, and because the properties of cell membranes depend on their phospholipid components, they will be discussed further in Sections 4.3 and 15.2 in connection with cell membranes.

Figure 2.21 The structure of steroids. All steroids share the basic four-ring skeleton. The seemingly minor differences in chemical structure between cholesterol, testosterone, and estrogen generate profound biological differences.

Figure 2.22 The phospholipid phosphatidylcholine. The molecule consists of a glycerol backbone whose hydroxyl groups are covalently bonded to two fatty acids and a phosphate group. The negatively charged phosphate is also bonded to a small, positively charged choline group. The end of the molecule that contains the phosphorylcholine (the polar head group) is hydrophilic, whereas the opposite end, consisting of the fatty acid tails, is hydrophobic. The structure and function of phosphatidylcholine and other phospholipids are discussed at length in Section 4.3.

Proteins


Proteins are the macromolecules that carry out virtually all of a cell’s activities; they are the molecular tools and machines that make things happen. As enzymes, proteins vastly accelerate the rate of metabolic reactions; as structural cables, proteins provide mechanical support both within cells and outside their perimeters (Figure 2.23a); as hormones, growth factors, and gene activators, proteins perform a wide variety of regulatory functions; as membrane receptors and transporters, proteins determine what a cell reacts to and what types of substances enter or leave the cell; as contractile filaments and molecular motors, proteins constitute the machinery for biological movements. Among their many other functions, proteins act as antibodies, serve as toxins, form blood clots, absorb or refract light (Figure 2.23b), and transport substances from one part of the body to another. How can one type of molecule have so many varied functions? The explanation resides in the virtually unlimited molecular structures that proteins, as a group, can assume. Each protein, however, has a unique and defined structure that enables it to carry out a particular function. Most importantly, proteins have shapes and surfaces that allow them to interact selectively with other molecules. Proteins, in other words, exhibit a high degree of specificity. It is possible, for example, for a particular DNA-cutting enzyme to recognize a segment of DNA containing one specific sequence of eight nucleotides, while ignoring all the other 65,535 possible sequences composed of this number of nucleotides.
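The figure of 65,535 "other" sequences follows from simple counting: each of the eight positions can hold any of four nucleotides, so there are 4^8 possible sequences, one of which is the recognition site. The sketch below computes the count and confirms it by brute-force enumeration; the use of the letters A, C, G, and T for the four DNA nucleotides and the function name are assumptions made for illustration.

```python
# Illustrative sketch: how many 8-nucleotide DNA sequences exist?
from itertools import product

NUCLEOTIDES = "ACGT"  # the four DNA nucleotides, written as single letters


def possible_sequences(length: int, alphabet_size: int = len(NUCLEOTIDES)) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


total = possible_sequences(8)
others = total - 1  # every sequence except the one the enzyme recognizes

# Cross-check the formula by listing every 8-letter sequence explicitly.
enumerated = sum(1 for _ in product(NUCLEOTIDES, repeat=8))

if __name__ == "__main__":
    print(f"total sequences of length 8: {total}")
    print(f"sequences the enzyme ignores: {others}")
    print(f"count by enumeration: {enumerated}")
```

The same calculation shows how quickly specificity grows with site length: each additional nucleotide in a recognition site multiplies the number of alternatives by four.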

The Building Blocks of Proteins Proteins are polymers made of amino acid monomers. Each protein has a unique sequence of amino acids that gives the molecule its unique properties. Many of the capabilities of a protein can be understood by examining the chemical properties of its constituent amino acids. Twenty different amino acids are commonly used in the construction of proteins, whether from a virus or a human. There are two aspects of amino acid structure to consider: that which is common to all of them and that which is unique to each. We will begin with the shared properties.

The Structures of Amino Acids All amino acids have a carboxyl group and an amino group, which are separated from each other by a single carbon atom, the α-carbon (Figure 2.24a,b). In a neutral aqueous solution, the α-carboxyl group loses its proton and exists in a negatively charged state (—COO⁻), and the α-amino group accepts a proton and exists in a positively charged state (—NH3⁺) (Figure 2.24b). We saw on page 44 that carbon atoms bonded to four different groups can exist in two configurations (stereoisomers) that cannot be superimposed on one another. Amino acids also have asymmetric carbon atoms. With the exception of glycine, the α-carbon of amino acids bonds to four different groups so that each amino acid can exist in either a D or an L form (Figure 2.25). Amino acids used in the synthesis of a protein on a ribosome are always L-amino acids. The “selection” of L-amino acids must have occurred very early in cellular evolution and has been conserved for billions of years. Microorganisms, however, use D-amino acids in the synthesis of certain small peptides, including those of the cell wall and several antibiotics (e.g., gramicidin A). During the process of protein synthesis, each amino acid becomes joined to two other amino acids, forming a long, continuous, unbranched polymer called a polypeptide chain.

Figure 2.23 Two examples of the thousands of biological structures composed predominantly of protein. These include (a) feathers, which are adaptations in birds for thermal insulation, flight, and sex recognition; and (b) the lenses of eyes, as in this spider, which focus light rays. (A: DARRELL GULIN/GETTY IMAGES; B: THOMAS SHAHAN/PHOTO RESEARCHERS, INC.)

The amino acids that make up a polypeptide chain are joined by peptide bonds that result from the linkage of the carboxyl group of one amino acid to the amino group of its neighbor, with the elimination of a molecule of water (Figure 2.24c). A polypeptide chain composed of a string of amino acids joined by peptide bonds has the following backbone:

—N(H)—CH(R)—C(=O)—N(H)—CH(R)—C(=O)—N(H)—CH(R)—C(=O)—

(Each —C(=O)—N(H)— link between adjacent residues is a peptide bond; R denotes an amino acid side chain.)

The “average” polypeptide chain contains about 450 amino acids. The longest known polypeptide, found in the muscle protein titin, contains more than 30,000 amino acids. Once incorporated into a polypeptide chain, amino acids are termed residues. The residue on one end of the chain, the N-terminus, contains an amino acid with a free (unbonded) α-amino group, whereas the residue at the opposite end, the C-terminus, has a free α-carboxyl group. In addition to amino acids, many proteins contain other types of components that are added after the polypeptide is synthesized. These include carbohydrates (to form glycoproteins), metal-containing groups (to form metalloproteins), and organic groups (e.g., flavoproteins).

Figure 2.24 Amino acid structure. Ball-and-stick model (a) and chemical formula (b) of a generalized amino acid in which R can be any of a number of chemical groups (see Figure 2.26). (c) The formation of a peptide bond occurs by the condensation of two amino acids, drawn here in the uncharged state. In the cell, this reaction occurs on a ribosome as an amino acid is transferred from a carrier (a tRNA molecule) onto the end of the growing polypeptide chain (see Figure 11.49).

Figure 2.25 Amino acid stereoisomerism. Because the α-carbon of all amino acids except glycine is bonded to four different groups, two stereoisomers can exist. The D and L forms of alanine are shown. [Diagram: D-alanine and L-alanine drawn as mirror images.]

1. Polar, charged. Amino acids of this group include aspartic acid, glutamic acid, lysine, and arginine. These four amino acids contain side chains that become fully charged; that is, the side chains contain relatively strong organic acids and bases. The ionization reactions of glutamic acid and

The Properties of the Side Chains The backbone, or main chain, of the polypeptide is composed of that part of each amino acid that is common to all of them. The side chain or R group (Figure 2.24), bonded to the α-carbon, is highly variable among the 20 building blocks, and it is this variability that ultimately gives proteins their diverse structures and activities. If the various amino acid side chains are considered together, they exhibit a large variety of structural features, ranging from fully charged to hydrophobic, and they can participate in a wide variety of covalent and noncovalent bonds. As discussed in the following chapter, the side chains of the “active sites” of enzymes can facilitate (catalyze) many different organic reactions. The assorted characteristics of the side chains of the amino acids are important in both intramolecular interactions, which determine the structure and activity of the molecule, and intermolecular interactions, which determine the relationship of a polypeptide with other molecules, including other polypeptides (page 61). Amino acids are classified on the basis of the character of their side chains. They fall roughly into four categories: polar and charged, polar and uncharged, nonpolar, and those with unique properties (Figure 2.26).
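The four-way classification can be made concrete with a short Python sketch. The grouping of one-letter codes follows Figure 2.26; the names `SIDE_CHAIN_CLASS` and `side_chain_composition` are illustrative choices rather than anything defined in the text, and the example peptide is arbitrary.

```python
from collections import Counter

# One-letter codes grouped into the four categories of Figure 2.26.
CATEGORIES = ("polar charged", "polar uncharged", "nonpolar", "unique")
SIDE_CHAIN_CLASS = {
    **dict.fromkeys("DEKRH", "polar charged"),
    **dict.fromkeys("STQNY", "polar uncharged"),
    **dict.fromkeys("AVLIMFW", "nonpolar"),
    **dict.fromkeys("GCP", "unique"),
}


def side_chain_composition(sequence):
    """Return the fraction of residues that fall into each category."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must contain at least one residue")
    unknown = set(seq) - set(SIDE_CHAIN_CLASS)
    if unknown:
        raise ValueError(f"unrecognized residue codes: {sorted(unknown)}")
    counts = Counter(SIDE_CHAIN_CLASS[residue] for residue in seq)
    return {category: counts[category] / len(seq) for category in CATEGORIES}


if __name__ == "__main__":
    # Arbitrary four-residue peptide: Asp-Glu-Ala-Gly.
    print(side_chain_composition("DEAG"))
```

A tally of this kind is only a first approximation, since the categories are described in the text as rough groupings.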

Figure 2.26 The chemical structure of amino acids. These 20 amino acids represent those most commonly found in proteins and, more specifically, those encoded by DNA. Other amino acids occur as the result of a modification to one of those shown here. The amino acids are arranged into four groups based on the character of their side chains, as described in the text. All molecules are depicted as free amino acids in their ionized state as they would exist in solution at neutral pH.

Polar charged: Aspartic acid (Asp or D), Glutamic acid (Glu or E), Lysine (Lys or K), Arginine (Arg or R), Histidine (His or H). Properties of side chains (R groups): Hydrophilic side chains act as acids or bases which tend to be fully charged (+ or –) under physiologic conditions. Side chains form ionic bonds and are often involved in chemical reactions.

Polar uncharged: Serine (Ser or S), Threonine (Thr or T), Glutamine (Gln or Q), Asparagine (Asn or N), Tyrosine (Tyr or Y). Properties of side chains: Hydrophilic side chains tend to have partial + or – charge allowing them to participate in chemical reactions, form H-bonds, and associate with water.

Nonpolar: Alanine (Ala or A), Valine (Val or V), Leucine (Leu or L), Isoleucine (Ile or I), Methionine (Met or M), Phenylalanine (Phe or F), Tryptophan (Trp or W). Properties of side chains: Hydrophobic side chain consists almost entirely of C and H atoms. These amino acids tend to form the inner core of soluble proteins, buried away from the aqueous medium. They play an important role in membranes by associating with the lipid bilayer.

Side chains with unique properties:
Glycine (Gly or G): Side chain consists only of hydrogen atom and can fit into either a hydrophilic or hydrophobic environment. Glycine often resides at sites where two polypeptides come into close contact.
Cysteine (Cys or C): Though side chain has polar, uncharged character, it has the unique property of forming a covalent bond with another cysteine to form a disulfide link.
Proline (Pro or P): Though side chain has hydrophobic character, it has the unique property of creating kinks in polypeptide chains and disrupting ordered secondary structure.


Figure 2.27 The ionization of charged, polar amino acids. (a) The side chain of glutamic acid loses a proton when its carboxylic acid group ionizes. The degree of ionization of the carboxyl group depends on the pH of the medium: the greater the hydrogen ion concentration (the lower the pH), the smaller the percentage of carboxyl groups that are present in the ionized state. Conversely, a rise in pH leads to an increased ionization of the proton from the carboxyl group, increasing the percentage of negatively charged glutamic acid side chains. The pH at which 50 percent of the side chains are ionized and 50 percent are unionized is called the pK, which is 4.4 for the side chain of free glutamic acid. At physiologic pH, virtually all of the glutamic acid residues of a polypeptide are negatively charged. (b) The side chain of lysine becomes ionized when its amino group gains a proton. The greater the hydroxyl ion concentration (the higher the pH), the smaller the percentage of amino groups that are positively charged. The pH at which 50 percent of the side chains of lysine are charged and 50 percent are uncharged is 10.0, which is the pK for the side chain of free lysine. At physiologic pH, virtually all of the lysine residues of a polypeptide are positively charged. Once incorporated into a polypeptide, the pK of a charged group can be greatly influenced by the surrounding environment.
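The dependence of ionization on pH described in this caption can be sketched numerically. The caption defines pK as the pH of 50 percent ionization but gives no formula; the sketch below assumes the standard Henderson-Hasselbalch relation and takes physiologic pH as 7.4, a value the caption does not state. The function names are illustrative.

```python
def fraction_deprotonated(ph, pk):
    """Fraction of groups that have given up their proton at a given pH.

    Assumes the Henderson-Hasselbalch relation:
    pH = pK + log10([deprotonated] / [protonated]).
    """
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


def charged_fraction_acidic(ph, pk):
    """Acidic side chain (e.g., glutamic acid): charged once the proton is lost."""
    return fraction_deprotonated(ph, pk)


def charged_fraction_basic(ph, pk):
    """Basic side chain (e.g., lysine): charged while the proton is held."""
    return 1.0 - fraction_deprotonated(ph, pk)


if __name__ == "__main__":
    PHYSIOLOGIC_PH = 7.4  # assumed value, not given in the caption
    PK_GLU = 4.4          # side chain of free glutamic acid (from the caption)
    PK_LYS = 10.0         # side chain of free lysine (from the caption)
    print(f"Glu side chains charged: {charged_fraction_acidic(PHYSIOLOGIC_PH, PK_GLU):.4f}")
    print(f"Lys side chains charged: {charged_fraction_basic(PHYSIOLOGIC_PH, PK_LYS):.4f}")
```

Both fractions come out above 0.99 under these assumptions, consistent with the statement that virtually all glutamic acid and lysine residues are charged at physiologic pH. The pK values are those of the free amino acids; as the caption notes, the pK within a polypeptide can differ.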

form hydrogen bonds with other molecules including water. These amino acids are often quite reactive. Included in this category are asparagine and glutamine (the amides of aspartic acid and glutamic acid), threonine, serine, and tyrosine.

3. Nonpolar. The side chains of these amino acids are hydrophobic and are unable to form electrostatic bonds or interact with water. The amino acids of this category are alanine, valine, leucine, isoleucine, tryptophan, phenylalanine, and methionine. The side chains of the nonpolar amino acids generally lack oxygen and nitrogen. They vary primarily in size and shape, which allows one or another of them to pack tightly into a particular space within the core of a protein, associating with one another as the result of van der Waals forces and hydrophobic interactions.

4. The other three amino acids (glycine, proline, and cysteine) have unique properties that separate them from the others. The side chain of glycine consists of only a hydrogen atom, and glycine is a very important amino acid for just this reason. Owing to its lack of a side chain, glycine residues provide a site where the backbones of two polypeptides (or two segments of the same polypeptide) can approach one another very closely. In addition, glycine is more flexible than other amino acids and allows parts of the backbone to move or form a hinge. Proline is unique in having its α-amino group as part of a ring (making it an imino acid). Proline is a hydrophobic amino acid that does not readily fit into an ordered secondary structure, such as an α helix (page 55), often producing kinks or hinges. Cysteine contains a reactive sulfhydryl (—SH) group and is often covalently linked to another cysteine residue, as a disulfide (—SS—) bridge.

[Diagram: oxidation of the sulfhydryl (—SH) groups of two cysteine residues forms a disulfide (—S—S—) bridge and releases 2H+ + 2e–; reduction reverses the reaction.]

Disulfide bridges often form between two cysteines that are distant from one another in the polypeptide backbone or even in two separate polypeptides. Disulfide bridges help stabilize the intricate shapes of proteins, particularly those present outside of cells where they are subjected to added physical and chemical stress. Not all of the amino acids described in this section are found in all proteins, nor are the various amino acids distributed in an equivalent manner. A number of other amino acids


lysine are shown in Figure 2.27. At physiologic pH, the side chains of these amino acids are almost always present in the fully charged state. Consequently, they are able to form ionic bonds with other charged species in the cell. For example, the positively charged arginine residues of histone proteins are linked by ionic bonds to the negatively charged phosphate groups of DNA (see Figure 2.3). Histidine is also considered a polar, charged amino acid, though in most cases it is only partially charged at physiologic pH. In fact, because of its ability to gain or lose a proton in physiologic pH ranges, histidine is a particularly important residue in the active site of many proteins (as in Figure 3.13). 2. Polar, uncharged. The side chains of these amino acids have a partial negative or positive charge and thus can


are also found in proteins, but they arise by alterations to the side chains of the 20 basic amino acids after their incorporation into a polypeptide chain. For this reason they are called posttranslational modifications (PTMs). Dozens of different types of PTMs have been documented. The most widespread and important PTM is the reversible addition of a phosphate group to a serine, threonine, or tyrosine residue. Lysine acetylation is another widespread and important PTM affecting thousands of proteins in a mammalian cell. PTMs can generate dramatic changes in the properties and function of a protein, most notably by modifying its three-dimensional structure, level of activity, localization within the cell, life span, and/or its interactions with other molecules. The presence or absence of a single phosphate group on a key regulatory protein has the potential to determine whether or not a cell will behave as a cancer cell or a normal cell. Because of PTMs, a single polypeptide can exist as a number of distinct biological molecules. The ionic, polar, or nonpolar character of amino acid side chains is very important in protein structure and function. Most soluble (i.e., nonmembrane) proteins are constructed so that the polar residues are situated at the surface of the molecule where they can associate with the surrounding water and contribute to the protein’s solubility in aqueous solution

(Figure 2.28a). In contrast, the nonpolar residues are situated predominantly in the core of the molecule (Figure 2.28b). The hydrophobic residues of the protein interior are often tightly packed together, creating a type of three-dimensional jigsaw puzzle in which water molecules are generally excluded. Hydrophobic interactions among the nonpolar side chains of these residues are a driving force during protein folding (page 64) and contribute substantially to the overall stability of the protein. In many enzymes, reactive polar groups project into the nonpolar interior, giving the protein its catalytic activity. For example, a nonpolar environment can enhance ionic interactions between charged groups that would be lessened by competition with water in an aqueous environment. Some reactions that might proceed at an imperceptibly slow rate in water can occur in millionths of a second within the protein.

The Structure of Proteins Nowhere in biology is the intimate relationship between form and function better illustrated than with proteins. The structure of most proteins is completely defined and predictable. Each amino acid in one of these giant macromolecules is located at a specific site within the structure, giving the protein the precise shape and reactivity required for the job at hand. Protein structure can be described at several levels of organization, each emphasizing a different aspect and each dependent on different types of interactions. Customarily, four such levels are described: primary, secondary, tertiary, and quaternary. The first, primary structure, concerns the amino acid sequence of a protein, whereas the latter three levels concern the organization of the molecule in space. To understand the mechanism of action and biological function of a protein it is essential to know how that protein is constructed.

Figure 2.28 Disposition of hydrophilic and hydrophobic amino acid residues in the soluble protein cytochrome c. (a) The hydrophilic side chains, which are shown in green, are located primarily at the surface of the protein where they contact the surrounding aqueous medium. (b) The hydrophobic residues, which are shown in red, are located primarily within the center of the protein, particularly in the vicinity of the central heme group. (ILLUSTRATION, IRVING GEIS. IMAGE FROM IRVING GEIS COLLECTION/HOWARD HUGHES MEDICAL INSTITUTE. RIGHTS OWNED BY HHMI. REPRODUCED BY PERMISSION ONLY.)

Figure 2.29 Scanning electron micrograph of a red blood cell from a person with sickle cell anemia. Compare with the micrograph of a normal red blood cell of Figure 4.32a. (COURTESY OF J. T. THORNWAITE, B. F. CAMERON, AND R. C. LEIF.)

Primary Structure The primary structure of a polypeptide is the specific linear sequence of amino acids that constitute the chain. With 20 different building blocks, the number of different polypeptides that can be formed is 20^n, where n is the number of amino acids in the chain. Because most polypeptides contain well over 100 amino acids, the variety of possible sequences is essentially unlimited. The information for the precise order of amino acids in every protein that an organism can produce is encoded within the genome of that organism. As we will see later, the amino acid sequence provides the information required to determine a protein’s three-dimensional shape and thus its function. The sequence of amino acids, therefore, is all-important, and changes that arise in the sequence as a result of genetic mutations in the DNA may not be readily tolerated. The earliest and best-studied example of this relationship is the change in the amino acid sequence of hemoglobin that causes the disease sickle cell anemia. This severe, inherited anemia results solely from a single change in amino acid sequence within the hemoglobin molecule: a nonpolar valine residue is present where a charged glutamic acid is normally located. This change in hemoglobin structure can have a dramatic effect on the shape of red blood cells, converting them from disk-shaped cells to sickle-shaped cells (Figure 2.29), which tend to clog small blood vessels, causing pain and life-threatening crises. Not all amino acid changes have such a dramatic effect, as evidenced by the differences in amino acid sequence in the same protein among related organisms. The degree to which changes in the primary sequence are tolerated depends on the degree to which the shape of the protein or the critical functional residues are disturbed.

The first amino acid sequence of a protein was determined by Frederick Sanger and co-workers at Cambridge University in the early 1950s. Beef insulin was chosen for the study because of its availability and its small size: two polypeptide chains of 21 and 30 amino acids each. The sequencing of insulin was a momentous feat in the newly emerging field of molecular biology. It revealed that proteins, the most complex molecules in cells, have a definable substructure that is neither regular nor repeating, unlike those of polysaccharides. Each particular polypeptide, whether insulin or some other species, has a precise sequence of amino acids that does not vary from one molecule to another. With the advent of techniques for rapid DNA sequencing (see Section 18.14), the primary structure of a polypeptide can be deduced from the nucleotide sequence of the encoding gene. In the past few years, the complete sequences of the genomes of hundreds of organisms, including humans, have been determined. This information will eventually allow researchers to learn about every protein that an organism can manufacture. However, translating information about primary sequence into knowledge of higher levels of protein structure remains a formidable challenge.

Secondary Structure All matter exists in space and therefore has a three-dimensional expression. Proteins are formed by linkages among vast numbers of atoms; consequently their shape is complex. The term conformation refers to the three-dimensional arrangement of the atoms of a molecule, that is, to their spatial organization. Secondary structure describes the conformation of portions of the polypeptide chain. Early studies on secondary structure were carried out by Linus Pauling and Robert Corey of the California Institute of Technology. By studying the structure of simple peptides consisting of a few amino acids linked together, Pauling and Corey concluded that polypeptide chains exist in preferred conformations that provide the maximum possible number of hydrogen bonds between neighboring amino acids. Two conformations were proposed. In one conformation, the backbone of the polypeptide assumed the form of a cylindrical, twisting spiral called the alpha (α) helix (Figure 2.30a,b). The backbone lies on the inside of the helix, and the side chains project outward. The helical structure is stabilized by hydrogen bonds between the atoms of one peptide bond and those situated just above and below it along the spiral (Figure 2.30c). The X-ray diffraction patterns of actual proteins produced during the 1950s bore out the existence of the α helix, first in the protein keratin found in hair and later in various oxygen-binding proteins, such as myoglobin and hemoglobin (see Figure 2.34). Surfaces on opposite sides of an α helix may have contrasting properties. In water-soluble proteins, the outer surface of an α helix often contains polar residues in contact with the solvent, whereas the surface facing inward typically contains nonpolar side chains.
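To give a feel for the count of 20^n possible sequences stated under Primary Structure, the short Python sketch below evaluates it for a few chain lengths taken from the text (the two insulin chains, a 100-residue chain, and the 450-residue "average" chain). The function names are illustrative, and the calculation simply assumes that each position can independently hold any of the 20 amino acids.

```python
import math

NUM_AMINO_ACIDS = 20  # building blocks encoded by DNA, per the text


def possible_sequences(n):
    """Number of distinct polypeptides of length n: 20 raised to the power n."""
    if n < 0:
        raise ValueError("chain length must be non-negative")
    return NUM_AMINO_ACIDS ** n


def sequence_space_exponent(n):
    """Base-10 exponent of 20**n, convenient for very long chains."""
    return n * math.log10(NUM_AMINO_ACIDS)


if __name__ == "__main__":
    for length in (21, 30, 100, 450):
        print(f"n = {length:3d}: about 10^{sequence_space_exponent(length):.1f} sequences")
```

Python integers have arbitrary precision, so `possible_sequences` returns exact values even for long chains; the logarithmic form is only there to keep the printed output readable.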

Figure 2.30 The alpha helix. (a) Tubular representation of an α helix. The atoms of the main chain would just fit within a tube of this radius. (b) The helical path around a central axis taken by the polypeptide backbone in a region of α helix. Each complete (360°) turn of the helix corresponds to 3.6 amino acid residues. The distance along the axis between adjacent residues is 1.5 Å. (c) The arrangement of the atoms of the backbone of the α helix and the hydrogen bonds that form between amino acids. Because of the helical rotation, the peptide bonds of every fourth amino acid come into close proximity. The approach of the carbonyl group (C=O) of one peptide bond to the imine group (H—N) of another peptide bond results in the formation of hydrogen bonds between them. The hydrogen bonds (orange bars) are essentially parallel to the axis of the cylinder and thus hold the turns of the chain together. (A: FROM JAYNATH R. BANAVER AND AMOS MARITAN. FIGURE CREATED BY TIMOTHY LEZON. REPRINTED WITH PERMISSION FROM THE ANNUAL REVIEW OF BIOPHYSICS, VOLUME 36; ©2007, BY ANNUAL REVIEWS, INC.)


The second conformation proposed by Pauling and Corey was the beta (β)-pleated sheet, which consists of several segments of a polypeptide lying side by side (Figure 2.31a). Unlike the coiled, cylindrical form of the α helix, the backbone of each segment of polypeptide (or β strand) in a β sheet assumes a folded or pleated conformation (Figure 2.31b). Like the α helix, the β sheet is also characterized by a large number of hydrogen bonds, but these are oriented perpendicular to the long axis of the polypeptide chain and project across from one part of the chain to another (Figure 2.31c). The strands of a β sheet can be arranged either parallel or antiparallel (as in Figure 2.31a) to one another. Like the α helix, the β sheet has also been found in many different proteins. Because β strands are highly extended, the β sheet resists pulling (tensile) forces.

Figure 2.31 The β-pleated sheet. (a) Tubular representation of an antiparallel β sheet. (b) Each polypeptide of a β sheet assumes an extended but pleated conformation referred to as a β strand. The pleats result from the location of the α-carbons above and below the plane of the sheet. Successive side chains (R groups in the figure) project upward and downward from the backbone. The distance along the axis between adjacent residues is 3.5 Å. (c) A β-pleated sheet consists of a number of β strands that lie parallel to one another and are joined together by a regular array of hydrogen bonds between the carbonyl and imine groups of the neighboring backbones. Neighboring segments of the polypeptide backbone may lie either parallel (in the same N-terminal → C-terminal direction) or antiparallel (in the opposite N-terminal → C-terminal direction). [Panel (b) carries a 7.0 Å distance label.] (A: FROM JAYNATH R. BANAVER AND AMOS MARITAN. FIGURE CREATED BY TIMOTHY LEZON. REPRINTED WITH PERMISSION FROM THE ANNUAL REVIEW OF BIOPHYSICS, VOLUME 36; © 2007, BY ANNUAL REVIEWS, INC.; B, C: ILLUSTRATION, IRVING GEIS. IMAGE FROM THE IRVING GEIS COLLECTION/HOWARD HUGHES MEDICAL INSTITUTE. RIGHTS OWNED BY HHMI. REPRODUCTION BY PERMISSION ONLY.)
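The per-residue distances quoted in the captions of Figures 2.30 and 2.31 (1.5 Å along the axis of an α helix, 3.5 Å along a β strand, and 3.6 residues per helical turn) make it easy to compare how far the same number of residues extends in each conformation. The Python sketch below does that arithmetic; the function names are illustrative, and the segments are treated as ideal, perfectly regular structures.

```python
HELIX_RISE_ANGSTROM = 1.5      # axial distance per residue in an alpha helix (Figure 2.30)
HELIX_RESIDUES_PER_TURN = 3.6  # residues per 360-degree turn (Figure 2.30)
STRAND_RISE_ANGSTROM = 3.5     # axial distance per residue in a beta strand (Figure 2.31)


def helix_length(n_residues):
    """Axial length in angstroms of an idealized alpha helix of n residues."""
    return n_residues * HELIX_RISE_ANGSTROM


def strand_length(n_residues):
    """Axial length in angstroms of an idealized beta strand of n residues."""
    return n_residues * STRAND_RISE_ANGSTROM


def helix_pitch():
    """Axial distance in angstroms covered by one full turn of the helix."""
    return HELIX_RESIDUES_PER_TURN * HELIX_RISE_ANGSTROM


if __name__ == "__main__":
    for n in (7, 24):  # the shortest and longest helical stretches reported for myoglobin
        print(f"{n} residues: helix {helix_length(n):.1f} A, strand {strand_length(n):.1f} A")
    print(f"helix pitch: {helix_pitch():.1f} A per turn")
```

With these figures a β strand spans a little more than twice the distance of an α helix containing the same number of residues, which is one way to read the description of strands as highly extended.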

Silk is composed of a protein containing an extensive amount of β sheet; silk fibers are thought to owe their strength to this architectural feature. Remarkably, a single fiber of spider silk, which may be a tenth the thickness of a human hair, is roughly five times stronger than a steel fiber of comparable weight. Those portions of a polypeptide chain not organized into an α helix or a β sheet may consist of hinges, turns, loops, or finger-like extensions. Often, these are the most flexible portions of a polypeptide chain and the sites of greatest biological activity. For example, antibody molecules are known for their specific interactions with other molecules (antigens); these interactions are mediated by a series of loops at one end of the antibody molecule (see Figures 17.15 and 17.16). The various types of secondary structures are most simply depicted as shown in Figure 2.32: α helices are represented by helical ribbons, β strands as flattened arrows, and connecting segments as thinner strands.

Figure 2.32 A ribbon model of ribonuclease. The regions of α helix are depicted as spirals and β strands as flattened ribbons with the arrows indicating the N-terminal → C-terminal direction of the polypeptide. Those segments of the chain that do not adopt a regular secondary structure (i.e., an α helix or β strand) consist largely of loops and turns. Disulfide bonds are shown in yellow. (HAND DRAWN BY JANE S. RICHARDSON.)

Tertiary Structure The next level above secondary structure is tertiary structure, which describes the conformation of the entire polypeptide. Whereas secondary structure is stabilized primarily by hydrogen bonds between atoms that form the peptide bonds of the backbone, tertiary structure is stabilized by an array of noncovalent bonds between the diverse side chains of the protein. Secondary structure is largely limited to a small number of conformations, but tertiary structure is virtually unlimited.

The detailed tertiary structure of a protein is usually determined using the technique of X-ray crystallography.⁵ In this technique (which is described in more detail in Sections 3.2 and 18.8), a crystal of the protein is bombarded by a thin beam of X-rays, and the radiation that is scattered (diffracted) by the electrons of the protein’s atoms is allowed to strike a radiation-sensitive plate or detector, forming an image of spots, such as those of Figure 2.33. When these diffraction patterns are subjected to complex mathematical analysis, an investigator can work backward to derive the structure responsible for producing the pattern. For many years it was presumed that all proteins had a fixed three-dimensional structure, which gave each protein its unique properties and specific functions. It came as a surprise to discover over the past decade or so that many proteins of higher organisms contain sizable segments that lack a defined conformation. Examples of proteins containing these types of unstructured (or disordered) segments can be seen in the models of the PrP protein in Figure 1 on page 67 and the histone tails in Figure 12.13c. The disordered regions in these proteins are depicted as dashed lines in the images, conveying the fact that these segments of the polypeptide (like pieces of spaghetti) can occupy many different positions and, thus, cannot be studied by X-ray crystallography. Disordered segments tend to have a predictable amino acid composition, being enriched in charged and polar residues and deficient in hydrophobic residues. You might be wondering whether proteins lacking a fully defined structure could be engaged in a useful function. In fact, the disordered regions of such proteins play key roles in vital cellular processes, often binding to DNA or to other proteins. Remarkably, these segments often undergo a physical transformation once they bind to an appropriate partner and are then seen to possess a defined, folded structure.

Most proteins can be categorized on the basis of their overall conformation as being either fibrous proteins, which have an elongated shape, or globular proteins, which have a compact shape. Most proteins that act as structural materials outside living cells are fibrous proteins, such as collagens and elastins of connective tissues, keratins of hair and skin, and silk. These proteins resist pulling or shearing forces to which they are exposed. In contrast, most proteins within the cell are globular proteins.

⁵ The three-dimensional structure of small proteins can also be determined by nuclear magnetic resonance (NMR) spectroscopy, which is not discussed in this text (see the supplement to the July issue of Nature Struct. Biol., 1998, Nature Struct. Biol. 7:982, 2000, and Trends Biochem. Sci. 34:601, 2009 for reviews of this technology). Figure 1a, page 67, shows an NMR-derived structure.

Figure 2.33 An X-ray diffraction pattern of myoglobin. The pattern of spots is produced as a beam of X-rays is diffracted by the atoms in the protein crystal, causing the X-rays to strike the film at specific sites. Information derived from the position and intensity (darkness) of the spots can be used to calculate the positions of the atoms in the protein that diffracted the beam, leading to complex structures such as that shown in Figure 2.34. (COURTESY OF JOHN C. KENDREW.)

Chapter 2 The Chemical Basis of Life

Myoglobin: The First Globular Protein Whose Tertiary Structure Was Determined The polypeptide chains of globular proteins are folded and twisted into complex shapes. Distant points on the linear sequence of amino acids are brought next to each other and linked by various types of bonds. The first glimpse at the tertiary structure of a globular protein came in 1957 through the X-ray crystallographic studies of John Kendrew and his colleagues at Cambridge University using X-ray dif-

(a)

Figure 2.34 The three-dimensional structure of myoglobin. (a) The tertiary structure of whale myoglobin. Most of the amino acids are part of helices. The nonhelical regions occur primarily as turns, where the polypeptide chain changes direction. The position of the heme is indicated in red. (b) The three-dimensional structure of myoglobin (heme

fraction patterns such as that shown in Figure 2.33. The protein they reported on was myoglobin. Myoglobin functions in muscle tissue as a storage site for oxygen; the oxygen molecule is bound to an iron atom in the center of a heme group. (The heme is an example of a prosthetic group, i.e., a portion of the protein that is not composed of amino acids, which is joined to the polypeptide chain after its assembly on the ribosome.) It is the heme group of myoglobin that gives most muscle tissue its reddish color. The first report on the structure of myoglobin provided a low-resolution profile sufficient to reveal that the molecule was compact (globular) and that the polypeptide chain was folded back on itself in a complex arrangement. There was no evidence of regularity or symmetry within the molecule, such as that revealed in the earlier description of the DNA double helix. This was not surprising considering the singular function of DNA and the diverse functions of protein molecules. The earliest crude profile of myoglobin revealed eight rod-like stretches of helix ranging from 7 to 24 amino acids in length. Altogether, approximately 75 percent of the 153 amino acids in the polypeptide chain is in the -helical conformation. This is an unusually high percentage compared with that for other proteins that have since been examined. No -pleated sheet was found. Subsequent analyses of myoglobin using additional X-ray diffraction data provided a much more detailed picture of the molecule (Figures 2.34a and 3.16). For example, it was shown that the heme group is situated within a pocket of hydrophobic side chains that promotes the binding of oxygen without the oxidation (loss of electrons) of the iron atom. Myoglobin contains no disulfide bonds; the tertiary structure of the protein is held together

(b)

indicated in red). The positions of all of the molecule’s atoms, other than hydrogen, are shown. (A: ILLUSTRATION, IRVING GEIS. IMAGE FROM IRVING GEIS COLLECTION/HOWARD HUGHES MEDICAL INSTITUTE. RIGHTS OWNED BY HHMI. REPRODUCED BY PERMISSION ONLY; B: KEN EWARD/PHOTO RESEARCHERS.)


exclusively by noncovalent interactions. All of the noncovalent bonds thought to occur between side chains within proteins (hydrogen bonds, ionic bonds, and van der Waals forces) have been found (Figure 2.35). Unlike myoglobin, most globular proteins contain both α helices and β sheets. Most importantly, these early landmark studies revealed that each protein has a unique tertiary structure that can be correlated with its amino acid sequence and its biological function.

Protein Domains Unlike myoglobin, most eukaryotic proteins are composed of two or more spatially distinct modules, or domains, that fold independently of one another. For example, the mammalian enzyme phospholipase C, shown in the central part of Figure 2.36, consists of four distinct domains, colored differently in the drawing. The different domains of a polypeptide often represent parts that function in a semi-independent manner. For example, they might bind different factors, such as a coenzyme and a substrate or a DNA strand and another protein, or they might move relatively independently of one another. Protein domains are often identified with a specific function. For example, proteins containing a PH domain bind to membranes containing a specific phospholipid, whereas proteins containing a chromodomain bind to a methylated lysine residue in another protein. The functions of a newly identified protein can usually be predicted from the domains of which it is made. Many polypeptides containing more than one domain are thought to have arisen during evolution by the fusion of genes that encoded different ancestral proteins, with each domain representing a part that was once a separate molecule. Each domain of the mammalian phospholipase C molecule, for example, has been identified as a homologous unit in another

protein (Figure 2.36). Some domains have been found only in one or a few proteins. Other domains have been shuffled widely about during evolution, appearing in a variety of proteins whose other regions show little or no evidence of an evolutionary relationship. Shuffling of domains creates proteins with unique combinations of activities. On average, mammalian proteins tend to be larger and contain more domains than proteins of less complex organisms, such as fruit flies and yeast.

Dynamic Changes within Proteins Although X-ray crystallographic structures possess exquisite detail, they are static images frozen in time. Proteins, in contrast, are not static and inflexible, but capable of considerable internal movements. Proteins are, in other words, molecules with “moving parts.” Because they are tiny, nanometer-sized objects, proteins can be greatly influenced by the energy of their environment. Random, small-scale fluctuations in the arrangement of the bonds within a protein create an incessant thermal motion within the molecule. Spectroscopic techniques, such as nuclear magnetic resonance (NMR), can monitor dynamic

Labels in Figure 2.35: van der Waals forces, hydrogen bond, ionic bond. Labels in Figure 2.36: bacterial phospholipase C, troponin C, mammalian phospholipase C, synaptotagmin, recoverin.

Figure 2.35 Types of noncovalent bonds maintaining the conformation of proteins.

Figure 2.36 Proteins are built of structural units, or domains. The mammalian enzyme phospholipase C is constructed of four domains, indicated in different colors. The catalytic domain of the enzyme is shown in blue. Each of the domains of this enzyme can be found independently in other proteins as indicated by the matching color. (FROM LIISA HOLM AND CHRIS SANDER, STRUCTURE 5:167, © 2007, WITH PERMISSION FROM ELSEVIER.)

2.5 Four Types of Biological Molecules


Chapter 2 The Chemical Basis of Life


movements within proteins, and they reveal shifts in hydrogen bonds, waving movements of external side chains, and the full rotation of the aromatic rings of tyrosine and phenylalanine residues about one of the single bonds. The important role that such movements can play in a protein’s function is illustrated in studies of acetylcholinesterase, the enzyme responsible for degrading acetylcholine molecules that are left behind following the transmission of an impulse from one nerve cell to another (Section 4.8). When the tertiary structure of acetylcholinesterase was first revealed by X-ray crystallography, there was no obvious pathway for acetylcholine molecules to enter the enzyme’s catalytic site, which was situated at the bottom of a deep gorge in the molecule. In fact, the narrow entrance to the site was completely blocked by the presence of a number of bulky amino acid side chains. Using high-speed computers, researchers have been able to simulate the random movements of thousands of atoms within the enzyme, a feat that cannot be accomplished using experimental techniques. These molecular dynamics (MD) simulations indicated that movements of side chains within the protein would lead to the rapid opening and closing of a “gate” that would allow acetylcholine molecules to diffuse into the enzyme’s catalytic site (Figure 2.37). The X-ray crystallographic structure of a protein (i.e., its crystal structure) can be considered an average structure, or “ground state.” A protein can undergo dynamic excursions from the ground state and assume alternate conformations that are accessible based on the amount of energy that the protein contains. Predictable (nonrandom) movements within a protein that are triggered by the binding of a specific molecule are described as conformational changes. Conformational changes typically involve the coordinated movements of various parts of the molecule.
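The idea behind a molecular dynamics calculation can be illustrated with a deliberately tiny sketch. The code below is not the method used in the acetylcholinesterase studies; it is a toy example that follows a single harmonic "bond" in one dimension using the velocity Verlet scheme, a standard way of stepping Newton's equations forward in time. The function name `simulate_bond`, the parameter values, and the reduced (unitless) quantities are all assumptions chosen for illustration. Real simulations apply the same kind of step-by-step update to thousands of atoms with many additional force terms.

```python
def simulate_bond(r0=1.0, k=100.0, mass=1.0, x_start=1.2, dt=0.001, steps=5000):
    """Follow one harmonic 'bond' in 1D with velocity Verlet (reduced units).

    r0 is the equilibrium bond length and k the spring constant; every value
    here is illustrative, not taken from a real force field.
    Returns the bond length and the total energy recorded after each step.
    """
    x, v = x_start, 0.0
    force = -k * (x - r0)              # restoring force toward r0
    lengths, energies = [], []
    for _ in range(steps):
        v += 0.5 * dt * force / mass   # half-step velocity update
        x += dt * v                    # full-step position update
        force = -k * (x - r0)          # force at the new position
        v += 0.5 * dt * force / mass   # second half-step velocity update
        lengths.append(x)
        energies.append(0.5 * mass * v * v + 0.5 * k * (x - r0) ** 2)
    return lengths, energies


lengths, energies = simulate_bond()
print("shortest and longest bond length:", min(lengths), max(lengths))
print("spread in total energy:", max(energies) - min(energies))
```

In this sketch the bond length oscillates about its equilibrium value while the total energy stays nearly constant, which is the behavior expected of a stable integration scheme with a small time step.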
A comparison of the polypeptides depicted in Figures 3a and 3b on page 82 shows the dramatic conformational change that occurs in a bacterial protein (GroEL) when it interacts with another protein (GroES). Virtually every activity in which a protein takes part is accompanied by conformational changes within the molecule (see http://molmovdb.org to watch examples). The conformational change in the protein myosin that occurs during muscle contraction is shown in Figures 9.60 and 9.61. In this case, binding of myosin to an actin molecule leads to a small (20°) rotation of the head of the myosin, which results in a 50 to 100 Å movement of the adjacent actin filament. The importance of this dynamic event can be appreciated if you consider that the movements of your body result from the additive effect of millions of conformational changes taking place within the contractile proteins of your muscles.

Quaternary Structure Whereas many proteins such as myoglobin are composed of only one polypeptide chain, most are made up of more than one chain, or subunit. The subunits may be linked by covalent disulfide bonds, but most often they are held together by noncovalent bonds, which typically occur between hydrophobic “patches” on the complementary surfaces of neighboring polypeptides. Proteins composed of subunits are said to have quaternary structure. Depending on the protein, the polypeptide chains may be identical or nonidentical. A protein composed of two identical subunits is described as a

Figure 2.37 Dynamic movements within the enzyme acetylcholinesterase. A portion of the enzyme is depicted here in two different conformations: (1) a closed conformation (left) in which the entrance to the catalytic site is blocked by the presence of aromatic rings that are part of the side chains of tyrosine and phenylalanine residues (shown in purple) and (2) an open conformation (right) in which the aromatic rings of these side chains have swung out of the way, opening the “gate” to allow acetylcholine molecules to enter the catalytic site. These images are constructed using computer programs that take into account a host of information about the atoms that make up the molecule, including bond lengths, bond angles, electrostatic attraction and repulsion, van der Waals forces, etc. Using this information, researchers are able to simulate the movements of the various atoms over a very short time period, which provides images of the conformations that the protein can assume. An animation of this image can be found on the Web at http://mccammon.ucsd.edu (COURTESY OF J. ANDREW MCCAMMON.)

homodimer, whereas a protein composed of two nonidentical subunits is a heterodimer. A ribbon drawing of a homodimeric protein is depicted in Figure 2.38a. The two subunits of the protein are drawn in different colors, and the hydrophobic residues that form the complementary sites of contact are indicated. The best-studied multisubunit protein is hemoglobin, the O2-carrying protein of red blood cells. A molecule of human hemoglobin consists of two α-globin and two β-globin polypeptides (Figure 2.38b), each of which binds a single molecule of oxygen. Elucidation of the three-dimensional structure of hemoglobin by Max Perutz of Cambridge University in 1959 was one of the early landmarks in the study of molecular biology. Perutz demonstrated that each of the four globin polypeptides of a hemoglobin molecule has a tertiary structure similar to that of myoglobin, a fact that strongly suggested the two proteins had evolved from a common ancestral polypeptide with a common O2-binding mechanism. Perutz also compared the structure of the oxygenated and deoxygenated versions of hemoglobin. In doing so, he discovered that the binding of oxygen was accompanied by the movement of the bound iron atom closer to the plane of the heme group. This seemingly inconsequential shift in position of a single atom pulled on an α helix to which the iron is connected, which in turn led to a series of increasingly larger movements within and between the subunits. This finding revealed for the

first time that the complex functions of proteins may be carried out by means of small changes in their conformation.

Protein–Protein Interactions Even though hemoglobin consists of four subunits, it is still considered a single protein with a single function. Many examples are known in which

different proteins, each with a specific function, become physically associated to form a much larger multiprotein complex. One of the first multiprotein complexes to be discovered and studied was the pyruvate dehydrogenase complex of the bacterium Escherichia coli, which consists of 60 polypeptide chains constituting three different enzymes (Figure 2.39). The enzymes that make up this complex catalyze a series of reactions connecting two metabolic pathways, glycolysis and the TCA cycle (see Figure 5.7). Because the enzymes are so closely associated, the product of one enzyme can be channeled directly to the next enzyme in the sequence without becoming diluted in the aqueous medium of the cell. Multiprotein complexes that form within the cell are not necessarily stable assemblies, such as the pyruvate dehydrogenase complex. In fact, most proteins interact with other proteins in highly dynamic patterns, associating and dissociating depending on conditions within the cell at any given time. Interacting proteins tend to have complementary surfaces. Often a projecting portion of one molecule fits into a pocket within its partner. Once the two molecules have come into close contact, their interaction is stabilized by noncovalent bonds. The reddish-colored object in Figure 2.40a is called an SH3 domain, and it is found as part of more than 200 different proteins involved in molecular signaling. The surface of an SH3 domain contains shallow hydrophobic “pockets” that become filled by complementary “knobs” projecting from another protein (Figure 2.40b). A large number of different structural domains have been identified that, like SH3, act as adaptors to mediate interactions between proteins. In many cases, protein–protein interactions are regulated by modifications, such as the addition of a phosphate group to a key amino acid, which act as a switch to turn on or off the protein’s ability to bind a protein partner. 
As more and more complex molecular activities have been discovered, the importance of


Figure 2.39 Pyruvate dehydrogenase: a multiprotein complex. (a) Electron micrograph of a negatively stained pyruvate dehydrogenase complex isolated from E. coli. Each complex contains 60 polypeptide chains constituting three different enzymes. Its molecular mass approaches 5 million daltons. (b) A model of the pyruvate dehydrogenase complex. The core of the complex consists of a cube-like cluster of dihydrolipoyl transacetylase molecules. Pyruvate dehydrogenase dimers (black spheres) are distributed symmetrically along the edges of the cube, and dihydrolipoyl dehydrogenase dimers (small gray spheres) are positioned in the faces of the cube. (COURTESY OF LESTER J. REED.)


Figure 2.38 Proteins with quaternary structure. (a) Drawing of transforming growth factor-β2 (TGF-β2), a protein that is a dimer composed of two identical subunits. The two subunits are colored yellow and blue. Shown in white are the cysteine side chains and disulfide bonds. The spheres shown in yellow and blue are the hydrophobic residues that form the interface between the two subunits. (b) Drawing of a hemoglobin molecule, which consists of two α-globin chains and two β-globin chains (a heterotetramer) joined by noncovalent bonds. When the four globin polypeptides are assembled into a complete hemoglobin molecule, the kinetics of O2 binding and release are quite different from those exhibited by isolated polypeptides. This is because the binding of O2 to one polypeptide causes a conformational change in the other polypeptides that alters their affinity for O2 molecules. (A: FROM S. DAOPIN ET AL., SCIENCE 257:372, COURTESY OF DAVID R. DAVIES, © 1992, REPRODUCED WITH PERMISSION FROM AAAS; B: ILLUSTRATION, IRVING GEIS. IMAGE FROM IRVING GEIS COLLECTION/HOWARD HUGHES MEDICAL INSTITUTE. RIGHTS OWNED BY HHMI. REPRODUCED BY PERMISSION ONLY.)


Figure 2.40 Protein–protein interactions. (a) A model illustrating the complementary molecular surfaces of portions of two interacting proteins. The reddish-colored molecule is an SH3 domain of the enzyme PI3K, whose function is discussed in Chapter 15. This domain binds specifically to a variety of proline-containing peptides, such as the one shown in the space-filling model at the top of the figure. The proline residues in the peptide, which fit into hydrophobic pockets on the surface of the enzyme, are indicated. The polypeptide backbone of the peptide is colored yellow, and the side chains are colored green. (b) Schematic model of the interaction between an SH3 domain and a peptide showing the manner in which certain residues of the peptide fit into hydrophobic pockets in the SH3 domain. (A: FROM HONGTAO YU AND STUART SCHREIBER, CELL 76:940, © 1994, WITH PERMISSION FROM ELSEVIER.)

interactions among proteins has become increasingly apparent. For example, such diverse processes as DNA synthesis, ATP formation, and RNA processing are all accomplished by “molecular machines” that consist of a large number of interacting proteins, some of which form stable relationships and others transient liaisons. Several hundred different protein complexes have been purified in large-scale studies on yeast. Most investigators who study protein–protein interactions want to know whether one protein they are working with, call it protein X, interacts physically with another protein, call it protein Y. This type of question can be answered using a genetic technique called the yeast two-hybrid (Y2H) assay, which is discussed in Section 18.7 and illustrated in Figure 18.27. In this technique, genes encoding the two proteins

being studied are introduced into the same yeast cell. If the yeast cell tests positive for a particular reporter protein, which is indicated by an obvious color change in the yeast cell, then the two proteins in question had to have interacted within the yeast cell nucleus. Interactions between proteins are generally studied one at a time, producing the type of data seen in Figure 2.40. In recent years, a number of research teams have set out to study protein–protein interactions on a global scale. For example, one might want to know all of the interactions that occur among the 14,000 or so proteins encoded by the genome of the fruit fly Drosophila melanogaster. Now that the entire genome of this insect has been sequenced, virtually every gene within the genome is available as an individual DNA segment that can be cloned and used as desired. Consequently, it should be possible to test the proteins encoded by the fly genome, two at a time, for possible interactions in a Y2H assay. One study of this type reported that, of the millions of possible combinations, more than 20,000 interactions were detected among 7048 fruit-fly proteins tested. Although the Y2H assay has been the mainstay in the study of protein–protein interactions for more than 15 years, it is an indirect assay (see Figure 18.27) and is fraught with uncertainties. On one hand, a large percentage of interactions known to occur between specific proteins fail to be detected in these types of experiments. The Y2H assay, in other words, gives a significant number of false negatives. The Y2H assay is also known to generate a large number of false positives; that is, it indicates that two proteins are capable of interacting when it is known from other studies that they do not do so under normal conditions within cells. 
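As a rough check on the scale of such a screen, the number of distinct protein pairs can be computed directly. The sketch below is illustrative only: the function name `unordered_pairs` is hypothetical, it counts each pair once and ignores self-pairs, and it takes the figures quoted above (about 14,000 proteins in the genome, 7048 proteins tested, roughly 20,000 interactions detected) as approximate inputs. An actual Y2H screen may test each pair in both orientations, which would double the counts.

```python
from math import comb


def unordered_pairs(n_proteins: int) -> int:
    """Count distinct two-protein combinations (each pair once, no self-pairs)."""
    return comb(n_proteins, 2)


# Figures quoted in the text, treated here as approximate inputs.
pairs_tested = unordered_pairs(7048)     # proteins tested in the cited study
pairs_genome = unordered_pairs(14000)    # approximate size of the fly proteome

# Fraction of tested pairs that scored as interacting (about 20,000 reported).
hit_rate = 20000 / pairs_tested

print(f"pairs among 7048 proteins:   {pairs_tested:,}")
print(f"pairs among 14,000 proteins: {pairs_genome:,}")
print(f"approximate hit rate:        {hit_rate:.2%}")
```

Both counts run to tens of millions, consistent with the "millions of possible combinations" mentioned above, and only a very small fraction of the tested pairs scored as interacting.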
In the study of fruit-fly proteins described above, the authors used computer analyses to narrow the findings from the original 20,000 interactions to approximately 5000 interactions in which they had high confidence. Overall, it is estimated that, on average, each protein encoded in the genome of a eukaryotic organism interacts with about five different protein partners. According to this estimate, human proteins would engage in roughly 100,000 different interactions. The results from large-scale protein–protein interaction studies can be presented in the form of a network, such as that shown in Figure 2.41. This figure displays the potential binding partners of the various yeast proteins that contain an SH3 domain (see Figure 2.40a) and illustrates the complexities of such interactions at the level of an entire organism. In this particular figure, only those interactions detected by two entirely different assays (Y2H and a biochemical technique that involves purification and analysis of protein complexes) are displayed, which makes the conclusions much more reliable than those based on a single technology. Those proteins that have multiple binding partners, such as Las17 situated near the center of Figure 2.41, are referred to as hubs. Hub proteins are more likely than non-hub proteins to be essential proteins, that is, proteins that the organism cannot survive without. Some hub proteins have several different binding interfaces and are capable of binding a number of different binding partners at the same time. In contrast, other hubs have a single binding interface, which is capable of binding several different partners, but only one at a time. Examples of each of these

types of hub proteins are illustrated in Figure 2.42. The hub protein depicted in Figure 2.42a plays a central role in the process of gene expression, while that shown in Figure 2.42b plays an equally important role in the process of cell division. Aside from obtaining a lengthy list of potential interactions, what do we learn about cellular activities from these types of large-scale studies? Most importantly, they provide a guideline for further investigation. Genome-sequencing projects have provided scientists with the amino acid sequences of a huge number of proteins whose very existence was previously unknown. What do these proteins do? One approach to determining a protein’s function is to identify the proteins with which it associates. If, for example, a known protein has been shown to be involved in DNA replication, and an unknown protein is found to interact with the known protein, then it is likely that the unknown protein is also part of the cell’s DNA-replication machinery. Thus, regardless of their limitations, these large-scale Y2H studies (and others using different assays not discussed) provide the starting point to explore a myriad of previously unknown protein–protein interactions, each of which has the potential to lead investigators to a previously unknown biological process.

Figure 2.41 A network of protein–protein interactions. Each red line represents an interaction between two yeast proteins, which are indicated by the named black dots. In each case, the arrow points from an SH3 domain protein to a target protein with which it can bind. The 59 interactions depicted here were detected using two different types of techniques that measure protein–protein interactions. (See Trends Biochem. Sci. 34:1, and 579, 2009, for discussion of the validity of protein–protein interaction studies.) (FROM A.H.Y. TONG, ET AL., SCIENCE 295:323, 2002, COPYRIGHT © 2002. REPRINTED WITH PERMISSION FROM AAAS.)

Proteins labeled in Figure 2.41: Vrp1, Bck1, Bni1, Prk1, Srv2, Bnr1, Pbs1, Ark1, Ydl146w, Fun21, Myo3, Fus1, Sho1, Yfr024c, Bzz1, Las17, Ypr171w, Bbc1, Ygr136w, Sla1, Acf2, Yer158c, Rvs167, Boi1, Ykl105c, Abp1, Myo5, Ydr409w, Yir003w, Yhl002w, Ubp7, Ygr268c, Ysc84, Ynl09w, Yjr083c, Ypr154w, Ymr192w, Yor197w, Ypl249c.

Figure 2.42 Protein–protein interactions of hub proteins. (a) The enzyme RNA polymerase II, which synthesizes messenger RNAs in the cell, binds a multitude of other proteins simultaneously using multiple interfaces. (b) The enzyme Cdc28, which phosphorylates other proteins as it regulates the cell division cycle of budding yeast. Cdc28 binds a number of different proteins (Cln1-Cln3) at the same interface, which allows only one of these partners to bind at a time. (FROM DAMIEN DEVOS AND ROBERT B. RUSSELL, CURR. OPIN. STRUCT. BIOL. 17:373, © 2007, WITH PERMISSION OF ELSEVIER.)

Protein Folding The elucidation of the tertiary structure of myoglobin in the late 1950s led to an appreciation for the complexity of protein architecture. An important question immediately arose: How does such a complex, folded, asymmetric organization arise in the cell? The first insight into this problem began with a serendipitous observation in 1956 by Christian Anfinsen at the National Institutes of Health. Anfinsen was studying the properties of ribonuclease A, a small enzyme that consists of a single polypeptide chain of 124 amino acids with 4 disulfide bonds linking various parts of the chain. The disulfide bonds of a protein are typically broken (reduced) by adding a reducing agent, such as mercaptoethanol, which converts each disulfide bridge to a pair of sulfhydryl (-SH) groups (see drawing, page 53). To make all of the disulfide bonds accessible to the reducing agent, Anfinsen found that the molecule had to first be partially unfolded. The unfolding or disorganization of a protein is termed denaturation, and it can be brought about by a variety of agents, including detergents, organic solvents, radiation, heat, and compounds such as urea and guanidine chloride, all of which interfere with the various interactions that stabilize a protein’s tertiary structure.

When Anfinsen treated ribonuclease molecules with mercaptoethanol and concentrated urea, he found that the preparation lost all of its enzymatic activity, which would be expected if the protein molecules had become totally unfolded. When he removed the urea and mercaptoethanol from the preparation, he found, to his surprise, that the molecules regained their normal enzymatic activity. The active ribonuclease molecules that had re-formed from the unfolded protein were indistinguishable both structurally and functionally from the correctly folded (i.e., native) molecules present at the beginning of the experiment (Figure 2.43). After extensive study, Anfinsen concluded that the linear sequence of amino acids contained all of the information required for the formation of the polypeptide’s three-dimensional conformation. Ribonuclease, in other words, is capable of self-assembly. As discussed in Chapter 3, events tend to progress toward states of lower energy. According to this concept, the tertiary structure that a polypeptide chain assumes after folding is the accessible structure with the lowest energy, which makes it the most thermodynamically stable structure that can be formed by that chain. It would appear that evolution selects for those amino acid sequences that generate a polypeptide chain capable of spontaneously arriving at a meaningful native state in a biologically reasonable time period. There have been numerous controversies in the study of protein folding. Many of these controversies stem from the fact that the field is characterized by highly sophisticated experimental, spectroscopic, and computational procedures that are required to study complex molecular events that typically occur on a microsecond timescale. These efforts have often yielded conflicting results and have generated data that are

open to more than one interpretation. For the sake of simplicity, we will restrict the discussion to “simple” proteins, such as ribonuclease, that consist of a single domain. One fundamental issue that has been extensively debated is whether all of the members of a population of unfolded proteins of a single species fold along a similar pathway or fold by means of a diverse set of routes that somehow converge upon the same native state. Recent studies that have simulated folding with the aid of specialized supercomputers appear to have swung the consensus toward the idea that single-domain proteins fold along a single dominant pathway characterized by a unique, relatively well-defined transition state (see Figure 2.45). Another issue that has been roundly debated concerns the types of events that occur at various stages during the folding process. In the course depicted in Figure 2.44a, protein folding is initiated by interactions among neighboring residues that lead to the formation of much of the secondary structure of the molecule. Once the α helices and β sheets are formed, subsequent folding is driven by hydrophobic interactions that bury nonpolar residues together in the central core of the protein. According to an alternate scheme shown in Figure 2.44b, the first major event in protein folding is the hydrophobic collapse of the polypeptide to form a compact structure in which the backbone adopts a native-like topology. Only after this collapse does significant secondary structure develop. Recent studies indicate that the two pathways depicted in Figure 2.44 lie at opposite extremes, and that most proteins probably fold by a middle-of-the-road scheme in which secondary structure formation and compaction occur simultaneously. These early

Figure 2.43 Denaturation and refolding of ribonuclease. A native ribonuclease molecule (with intramolecular disulfide bonds indicated) is reduced and unfolded with β-mercaptoethanol and 8 M urea. After removal of these reagents, the protein undergoes spontaneous refolding. (FROM C. J. EPSTEIN, R. F. GOLDBERGER, AND C. B. ANFINSEN, COLD SPRING HARBOR SYMP. QUANT. BIOL. 28:439, 1963. REPRINTED WITH PERMISSION FROM COLD SPRING HARBOR LABORATORY PRESS.) Arrow labels in the figure: Unfolding (urea + mercaptoethanol); Refolding.

Figure 2.44 Two alternate pathways by which a newly synthesized or denatured protein could achieve its native conformation. Curled segments represent α helices, and arrows represent β strands. Stage labels in panels (a) and (b): Unfolded, Secondary structure, Collapse, Native.

Figure 2.45 Along the folding pathway. The image on the left shows the native tertiary structure of the enzyme acyl-phosphatase. The image on the right is the transition structure, which represents the state of the molecule at the top of an energy barrier that must be crossed if the protein is going to reach the native state. The transition structure consists of numerous individual lines because it is a set (ensemble) of closely related structures. The overall architecture of the transition structure is similar to that of the native protein, but many of the finer structural features of the fully folded protein have yet to emerge. Conversion of the transition state to the native protein involves completing secondary structure formation, tighter packing of the side chains, and finalizing the burial of hydrophobic side chains from the aqueous solvent. (FROM K. LINDORFF-LARSEN, ET AL, TRENDS BIOCHEM. SCI. 30:14, 2005, FIG. 1B. © 2005, WITH PERMISSION FROM ELSEVIER. IMAGE FROM CHRISTOPHER DOBSON.)

folding events lead to the formation of a partially folded, transient structure that resembles the native protein but lacks many of the specific interactions between amino acid side chains that are present in the fully folded molecule (Figure 2.45). If the information that governs folding is embedded in a protein’s amino acid sequence, then alterations in this sequence have the potential to change the way a protein folds, leading to an abnormal tertiary structure. In fact, many mutations responsible for inherited disorders have been found to alter a protein’s three-dimensional structure. In some cases, the consequences of protein misfolding can be fatal. Two examples of fatal neurodegenerative diseases that result from abnormal protein folding are discussed in the accompanying Human Perspective.

Figure 2.46 The role of molecular chaperones in encouraging protein folding. The steps are described in the text. (Other families of chaperones are known but are not discussed.) Labels in the figure: mRNA, ribosome, nascent polypeptide, chaperone (Hsp70), chaperonin (TRiC), native polypeptide; steps 1 to 5.

The Role of Molecular Chaperones Not all proteins are able to assume their final tertiary structure by a simple process of self-assembly. This is not because the primary structure of these proteins lacks the required information for proper folding, but rather because proteins undergoing folding have to be prevented from interacting nonselectively with other molecules in the crowded compartments of the cell. Several families of proteins have evolved whose function is to help unfolded or misfolded proteins achieve their proper three-dimensional conformation. These “helper proteins” are called molecular chaperones, and they selectively bind to short stretches of hydrophobic amino acids that tend to be exposed in nonnative proteins but buried in proteins having a native conformation. Figure 2.46 depicts the activities of two families of molecular chaperones that operate in the cytosol of eukaryotic cells. Molecular chaperones are involved in a multitude of activities within cells, ranging from the import of proteins into organelles (see Figure 8.47a) to the prevention and reversal of protein aggregation. We will restrict the discussion to their actions on newly synthesized proteins. Polypeptide chains are synthesized on ribosomes by the addition of amino acids, one at a time, beginning at the chain’s N-terminus (step 1, Figure 2.46). Chaperones of the Hsp70 family bind to elongating polypeptide chains as they emerge from an exit channel within the large subunit of the ribosome (step 2). Hsp70 chaperones are thought to prevent these partially formed polypeptides (i.e., nascent polypeptides) from binding to other proteins in the cytosol, which would cause them either to aggregate or misfold. Once their synthesis has been completed (step 3), many of these proteins are simply released by the chaperones into the cytosol where they spontaneously fold into their native state (step 4).
Other proteins are repeatedly bound and released by chaperones until they finally reach their fully folded state. Many of the larger polypeptides are transferred from Hsp70 proteins to a different type of chaperone called a chaperonin (step 5). Chaperonins are cylindrical protein complexes that contain chambers in which newly synthesized polypeptides can fold without interference from other macromolecules in the cell. TRiC is a chaperonin thought to assist in the folding of up to 15 percent of the polypeptides synthesized in mammalian cells. The discovery and mechanism of action of Hsp70 and chaperonins are discussed in depth in the Experimental Pathways on page 80.


THE HUMAN PERSPECTIVE

Chapter 2 The Chemical Basis of Life

Protein Misfolding Can Have Deadly Consequences

In April 1996 a paper was published in the medical journal Lancet that generated widespread alarm in the populations of Europe. The paper described a study of 10 persons afflicted with Creutzfeldt-Jakob disease (CJD), a rare, fatal disorder that attacks the brain, causing a loss of motor coordination and dementia. Like numerous other diseases, CJD can occur as an inherited disease that runs in certain families or as a sporadic form that appears in individuals who have no family history of the disease. Unlike virtually every other inheritable disease, however, CJD can also be acquired. Until recently, persons who had acquired CJD had been recipients of organs or organ products that were donated by a person with undiagnosed CJD. The cases described in the 1996 Lancet paper had also been acquired, but the apparent source of the disease was contaminated beef that the infected individuals had eaten years earlier. The contaminated beef was derived from cattle raised in England that had contracted a neurodegenerative disease that caused the animals to lose motor coordination and develop demented behavior. The disease became commonly known as “mad cow disease.” Patients who have acquired CJD from eating contaminated beef can be distinguished by several criteria from those who suffer from the classical forms of the disease. To date, roughly 200 people have died of CJD acquired from contaminated beef, and the numbers of such deaths have been declining.1

A disease that runs in families can invariably be traced to a faulty gene, whereas diseases that are acquired from a contaminated source can invariably be traced to an infectious agent. How can the same disease be both inherited and infectious? The answer to this question has emerged gradually over the past several decades, beginning with observations by D. Carleton Gajdusek in the 1960s on a strange malady that once afflicted the native population of Papua New Guinea.
Gajdusek showed that these islanders were contracting a fatal neurodegenerative disease—which they called “kuru”—during a funeral ritual in which they ate the brain tissue of a recently deceased relative. Autopsies of the brains of patients who had died of kuru showed a distinct pathology, referred to as spongiform encephalopathy, in which certain brain regions were riddled with microscopic holes (vacuolations), causing the tissue to resemble a sponge. It was soon shown that the brains of islanders suffering from kuru were strikingly similar in microscopic appearance to the brains of persons afflicted with CJD. This observation raised an important question: Did the brain of a person suffering from CJD, which was known to be an inherited disease, contain an infectious agent? In 1968, Gajdusek showed that when extracts prepared from a biopsy of the brain of a person who had died from CJD were injected into a suitable laboratory animal, that animal did indeed develop a spongiform encephalopathy similar to that of kuru or CJD. Clearly, the extracts contained an infectious agent, which at the time was presumed to be a virus. In 1982, Stanley Prusiner of the University of California, San Francisco, published a paper suggesting that, unlike viruses, the infectious agent responsible for CJD lacked nucleic acid and instead was 1

On the surface, this would suggest that the epidemic has run its course, but there are several reasons for public health officials to remain concerned. For one, studies of tissues that had been removed during surgeries in England indicate that thousands of people are likely to be infected with the disease without exhibiting symptoms (discussed in Science 335:411, 2012). Even if these individuals never develop clinical disease, they remain potential carriers who could pass CJD on to others through blood transfusions. In fact, at least two individuals are believed to have contracted CJD after receiving blood from a donor harboring the disease. These findings underscore the need to test blood for the presence of the responsible agent (whose nature is discussed below).

composed solely of protein. He called the protein a prion. This “protein only” hypothesis, as it is called, was originally met with considerable skepticism, but subsequent studies by Prusiner and others have provided overwhelming support for the proposal. It was presumed initially that the prion protein was an external agent—some type of virus-like particle lacking nucleic acid. Contrary to this expectation, the prion protein was soon shown to be encoded by a gene (called PRNP) within the cell’s own chromosomes. The gene is expressed within normal brain tissue and encodes a protein designated PrPC (standing for prion protein cellular) that resides at the surface of nerve cells. The precise function of PrPC remains a mystery. A modified version of the protein (designated PrPSc, standing for prion protein scrapie) is present in the brains of humans with CJD. Unlike the normal PrPC, the modified version of the protein accumulates within nerve cells, forming aggregates that kill the cells. In their purified states, PrPC and PrPSc have very different physical properties. PrPC remains as a monomeric molecule that is soluble in salt solutions and is readily destroyed by protein-digesting enzymes. In contrast, PrPSc molecules interact with one another to form insoluble fibrils that are resistant to enzymatic digestion. Based on these differences, one might expect these two forms of the PrP protein to be composed of distinctly different sequences of amino acids, but this is not the case. The two forms can have identical amino acid sequences, but they differ in the way the polypeptide chain folds to form the three-dimensional protein molecule (Figure 1). Whereas a PrPC molecule consists largely of α-helical segments and interconnecting coils, the core of a PrPSc molecule consists largely of β-sheet.
It is not hard to understand how a mutant polypeptide might be less stable and more likely to fold into the abnormal PrPSc conformation, but how is such a protein able to act as an infectious agent? According to the prevailing hypothesis, an abnormal prion molecule (PrPSc) can bind to a normal protein molecule (PrPC) and cause the normal protein to fold into the abnormal form. This conversion can be shown to occur in the test tube: addition of PrPSc to a preparation of PrPC can convert the PrPC molecules into the PrPSc conformation. According to this hypothesis, the appearance of the abnormal protein in the body—whether as a result of a rare misfolding event in the case of sporadic disease or by exposure to contaminated beef—starts a chain reaction in which normal protein molecules in the cells are gradually converted to the misshapen prion form as they are recruited into growing insoluble fibrils. The precise mechanism by which prions lead to neurodegeneration remains unclear. CJD is a rare disease caused by a protein with unique infective properties. Alzheimer’s disease (AD), on the other hand, is a common disorder that strikes as many as 10 percent of individuals who are at least 65 years of age and perhaps 40 percent of individuals who are 80 years or older. Persons with AD exhibit memory loss, confusion, and a loss of reasoning ability. CJD and AD share a number of important features. Both are fatal neurodegenerative diseases that can occur in either an inherited or sporadic form. Like CJD, the brain of a person with Alzheimer’s disease contains fibrillar deposits of an insoluble material referred to as amyloid (Figure 2).2 In both 2

It should be noted that the term amyloid is not restricted to the abnormal protein found in AD. Many different proteins are capable of assuming an abnormal conformation that is rich in β-sheet, which causes the protein monomers to aggregate into characteristic amyloid fibrils that bind certain dyes. Amyloid fibrils are defined by their molecular structure in which the β-strands are oriented perpendicular to the long axis of the fibrils. The PrPSc-forming fibrils of prion diseases are also described as amyloid.


Figure 1 A contrast in structure. (a) Tertiary structure of the normal (PrPC) protein as determined by NMR spectroscopy. The orange portions represent α-helical segments, and the blue portions are short β strands. The yellow dotted line represents the N-terminal portion of the polypeptide, which lacks defined structure. (b) A proposed model of the abnormal, infectious (PrPSc) prion protein, which consists largely of β-sheet. The actual tertiary structure of the prion protein has not been determined. The two molecules shown in this figure are formed by polypeptide chains that can be identical in amino acid sequence but fold very differently. As a result of the differences in folding, PrPC remains soluble, whereas PrPSc produces aggregates that kill the cell. (The two molecules shown in this figure are called conformers because they differ only in conformation.) (A: FROM R. RIEK, ET AL., FEBS 413, 287, FIG. 1. © 1997, WITH PERMISSION FROM ELSEVIER. IMAGE FROM KURT WÜTHRICH. B: REPRINTED FROM S. B. PRUSINER, TRENDS BIOCHEM. SCI. 21:483, 1996 COPYRIGHT 1996, WITH PERMISSION FROM ELSEVIER.)

diseases, the fibrillar deposits result from the self-association of a polypeptide composed predominantly of β-sheet. There are also many basic differences between the two diseases: the proteins that form the disease-causing aggregates are unrelated, the parts of the brain that are affected are distinct, and the protein responsible for AD is not considered to be an infectious agent (i.e., it does not spread in a contagious pattern from one person to another, although it may spread from cell to cell within the brain).

Over the past two decades, research on AD has been dominated by the amyloid hypothesis, which contends that the disease is caused by the production of a molecule, called the amyloid β-peptide (Aβ). Aβ is originally part of a larger protein called the amyloid precursor protein (APP), which spans the nerve cell membrane. The Aβ peptide is released from the APP molecule following cleavage by two specific enzymes, β-secretase and γ-secretase (Figure 3). The length of the Aβ peptide is somewhat variable. The predominant species

Figure 2 Alzheimer’s disease. (a) The defining characteristics of brain tissue from a person who died of Alzheimer’s disease. (b) Amyloid plaques containing aggregates of the Aβ peptide appear extracellularly (between nerve cells), whereas neurofibrillary tangles (NFTs) appear within the cells themselves. NFTs, which are discussed at the end of the Human Perspective, are composed of misfolded tangles of a protein called tau that is involved in maintaining the microtubule organization of the nerve cell. Both the plaques and tangles have been implicated as a cause of the disease. (A: © THOMAS DEERINCK, NCMIR/PHOTO RESEARCHERS, INC. B: © AMERICAN HEALTH ASSISTANCE FOUNDATION.)

Figure 3 Formation of the Aβ peptide. The Aβ peptide is carved from the amyloid precursor protein (APP) as the result of cleavage by two enzymes, β-secretase and γ-secretase. It is interesting that APP and the two secretases are all proteins that span the membrane. Cleavage of APP occurs inside the cell (probably in the endoplasmic reticulum), and the Aβ product is ultimately secreted into the space outside of the cell. The γ-secretase can cut at either of two sites in the APP molecule, producing either Aβ40 or Aβ42 peptides, the latter of which is primarily responsible for production of the amyloid plaques seen in Figure 2. γ-Secretase is a multisubunit enzyme that cleaves its substrate at a site within the membrane.

has a length of 40 amino acids (designated as Aβ40), but a minor species with two additional hydrophobic residues (designated as Aβ42) is also produced. Both of these peptides can exist in a soluble form that consists predominantly of α helices, but Aβ42 has a tendency to spontaneously refold into a very different conformation that contains considerable β-pleated sheet. It is the misfolded Aβ42 version of the molecule that has the greatest potential to cause damage to the brain. Aβ42 tends to self-associate to form small complexes (oligomers) as well as large aggregates that are visible as fibrils in the electron microscope. These amyloid fibrils are deposited outside of the nerve cells in the form of extracellular amyloid plaques (Figure 2). Although the issue is far from settled, a body of evidence suggests that it is the soluble oligomers that are most toxic to nerve cells, rather than the insoluble aggregates. Cultured nerve cells, for example, are much more likely to be damaged by the presence of soluble intracellular Aβ oligomers than by either Aβ monomers or extracellular fibrillar aggregates. In the brain, the Aβ oligomers appear to attack the synapses that connect one nerve cell to another and eventually lead to the death of the nerve cells. Persons who suffer from an inherited form of AD carry a mutation that leads to an increased production of the Aβ42 peptide. Overproduction of Aβ42 can be caused by possession of extra copies (duplications) of the APP gene, by mutations in the APP gene, or by mutations in genes (PSEN1, PSEN2) that encode subunits of γ-secretase. Individuals with such mutations exhibit symptoms of the disease at an early age, typically in their 50s. The fact that all mutations associated with these inherited, early-onset forms of AD lead to increased production of Aβ42 is the strongest argument favoring amyloid formation

as the underlying basis of the disease. The strongest argument against the amyloid hypothesis is the weak correlation that can exist between the number and size of amyloid plaques in the brain and the severity of the disease. Elderly persons who show little or no sign of memory loss or dementia can have relatively high levels of amyloid deposits in their brain and those with severe disease can have little or no amyloid deposition. All of the drugs currently on the market for the treatment of AD are aimed only at management of symptoms; none has any effect on stopping disease progression. With the amyloid hypothesis as the guiding influence, researchers have followed three basic strategies in the pursuit of new drugs for the prevention and/or reversal of mental decline associated with AD. These strategies are (1) to prevent the formation of the Aβ42 peptide in the first place; (2) to remove the Aβ42 peptide (or the amyloid deposits it produces) once it has been formed; and (3) to prevent the interaction between Aβ molecules, thereby preventing the formation of both oligomers and fibrillar aggregates. Before examining each of these strategies, we can consider how investigators can determine what type of drugs might be successful in the prevention or treatment of AD. One of the best approaches to the development of treatments for human diseases is to find laboratory animals, particularly mice, that develop similar diseases, and use these animals to test the effectiveness of potential therapies. Animals that exhibit a disease that mimics a human disease are termed animal models. For whatever reason, the brains of aging mice show no evidence of the amyloid deposits found in humans, and, up until 1995, there was no animal model for AD. Then, in that year, researchers found that they could create a strain of mice that developed amyloid plaques in their brain and performed poorly at tasks that required memory.
They created this strain by genetically engineering the mice to carry a mutant human APP gene, one responsible for causing AD in families. These genetically engineered (transgenic) mice have proven invaluable for testing potential therapies for AD. The greatest excitement in the field of AD therapeutics has centered on the second strategy mentioned above, and we can use these investigations to illustrate some of the steps required in the development of a new drug. In 1999, Dale Schenk and his colleagues at Elan Pharmaceuticals published an extraordinary finding. They had discovered that the formation of amyloid plaques in mice carrying the mutant human APP gene could be blocked by repeatedly injecting the animals with the very same substance that causes the problem, the aggregated Aβ42 peptide. In effect, the researchers had immunized (i.e., vaccinated) the mice against the disease. When young (6-week-old) mice were immunized with Aβ42, they failed to develop the amyloid brain deposits as they grew older. When older (13-month-old) mice whose brains already contained extensive amyloid deposits were immunized with Aβ42, a significant fraction of the fibrillar deposits was cleared out of the nervous system. Even more importantly, the immunized mice performed better than their nonimmunized littermates on memory-based tests. The dramatic success of these experiments on mice, combined with the fact that the animals showed no ill effects from the immunization procedure, led government regulators to quickly approve a Phase I clinical trial of the Aβ42 vaccine. A Phase I clinical trial is the first step in testing a new drug or procedure in humans and usually comes after years of preclinical testing on cultured cells and animal models. Phase I tests are carried out on a small number of subjects and are designed to monitor the safety of the therapy and the optimal dose of the drug rather than its effectiveness against the disease.
None of the subjects in two separate Phase I trials of the Aβ vaccine showed any ill effects from the injection of the amyloid peptide. As a result, the investigators were allowed to proceed to a Phase II clinical trial, which involves a larger group of subjects and is designed to obtain a measure of the effectiveness of the procedure (or drug). This particular Phase II trial was carried out as a randomized, double-blind, placebo-controlled study. In this type of study: (1) the patients are randomly divided into two groups that are treated similarly except that one group is given the curative factor (protein, antibodies, drugs, etc.) being investigated and the other group is given a placebo (an inactive substance that has no therapeutic value), and (2) the study is double-blinded, which means that neither the researchers nor the patients know who is receiving treatment and who is receiving the placebo.
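The logic of such a study design can be made concrete with a short sketch. The Python example below is only an illustration of random assignment and blinding, not a description of how the Aβ vaccine trial (or any actual trial) was administered. The function name `randomize_double_blind`, the `KIT-` code format, and the patient identifiers are all hypothetical, introduced here for illustration.

```python
import random


def randomize_double_blind(patient_ids, seed=None):
    """Randomly split patients into two arms and hide the arms behind codes.

    Returns (blinded, key):
      blinded maps patient id -> opaque kit code (all that researchers and
              patients would see during the study);
      key     maps kit code -> "treatment" or "placebo" (held by a third
              party until the study is unblinded).
    Assumes patient_ids are unique. Hypothetical sketch only.
    """
    rng = random.Random(seed)

    # Random division into two groups of (nearly) equal size.
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    arm_of = {pid: "treatment" for pid in shuffled[:half]}
    arm_of.update({pid: "placebo" for pid in shuffled[half:]})

    # Opaque codes, shuffled so that a code reveals nothing about the arm.
    codes = [f"KIT-{i:04d}" for i in range(1, len(shuffled) + 1)]
    rng.shuffle(codes)

    blinded = dict(zip(patient_ids, codes))
    key = {blinded[pid]: arm_of[pid] for pid in patient_ids}
    return blinded, key


if __name__ == "__main__":
    patients = [f"P{i:03d}" for i in range(1, 11)]  # made-up identifiers
    blinded, key = randomize_double_blind(patients, seed=2001)
    for pid in patients:
        print(pid, blinded[pid])  # no arm information is visible here
```

Keeping the `key` separate from the `blinded` table is what makes the sketch "double-blind": outcome data can be recorded against kit codes alone, and the arms are compared only after the key is opened.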


Figure 4 A neuroimaging technique that reveals the presence of amyloid in the brain. These PET (positron emission tomography) scans show the brains of two individuals that have ingested a radioactive compound, called Amyvid, that binds to amyloid deposits and appears red in the image. The top shows a healthy brain and the bottom a brain from a patient with AD, revealing extensive amyloid build-up. Amyloid deposits in the brain can be detected with this technique in persons who show no evidence of cognitive dysfunction. Such symptom-free individuals are presumed to be at high risk of going on to develop AD. Those who lack such deposits can be considered at very low risk of the disease in the near future. (COURTESY OF ELI LILLY/AVID PHARMACEUTICALS.)


The Phase II trial for the Aβ vaccine began in 2001 and enrolled more than 350 individuals in the United States and Europe who had been diagnosed with mild to moderate AD. After receiving two injections of synthetic β-amyloid (or a placebo), 6 percent of the subjects experienced a potentially life-threatening inflammation of the brain. Most of these patients were successfully treated with steroids, but the trial was discontinued. Once it had become apparent that vaccination of patients with Aβ42 had inherent risks, it was decided to pursue a safer form of immunization therapy, which is to administer antibodies directed against Aβ that have been produced outside the body. This type of approach is known as passive immunization because the person does not produce the therapeutic antibodies themselves. Passive immunization with an anti-Aβ42 antibody (called bapineuzumab) had already proven capable of restoring memory function in transgenic mice and was quickly shown to be safe, and apparently effective, in Phase I and II clinical trials. The last step before government approval is a Phase III trial, which typically employs large numbers of subjects (a thousand or more at several research centers) and compares the effectiveness of the new treatment against standard approaches. The first results of the Phase III trials on bapineuzumab were reported in 2008 and were disappointing; there was little or no evidence that the antibody provided benefits in preventing the progression of the disease. Despite these findings, a large number of different antibodies targeting amyloid peptides are currently in clinical trials. Given the impact of AD on human health and the large amount of money that could be earned from this type of drug, pharmaceutical companies are willing to take the risk that one of these immunologic strategies will exhibit some therapeutic value.
Meanwhile, a comprehensive analysis of some of the patients who had been vaccinated with Aβ42 in the original immunization trial from 2001 was also reported in 2008. Analysis of this patient group indicated that the Aβ42 vaccination had had no effect on preventing disease progression. It was particularly striking that in several of these patients who had died of severe dementia, there were virtually no amyloid plaques left in their brains. This finding strongly suggests that removal of amyloid deposits in a patient already suffering the symptoms of mild-to-moderate dementia does not stop disease progression. These results can be interpreted in more than one way. One interpretation is that the amyloid deposits are not the cause of the symptoms of dementia. An alternate interpretation is that irreversible toxic effects of the deposits had already occurred by the time immunization had begun and it was too late to reverse the disease course using treatments that remove existing amyloid deposits. It is important to note, in this regard, that the formation of amyloid deposits in the brain begins 10 or more years before any clinical symptoms of AD are reported. It is possible that, if these treatments had started earlier, the symptoms of the disease might never have appeared. Recent advances in brain-imaging procedures now allow clinicians to observe amyloid deposits in the brains of individuals long before any symptoms of AD have developed (Figure 4). Based on these studies, it may be possible to begin preventive treatments in

persons who are at very high risk of developing AD before they develop symptoms. The first clinical trial of this type was begun in 2012, when several hundred individuals who would normally be destined to develop early-onset AD (due to mutations in the PSEN1 gene) were treated with an anti-Aβ antibody in the hopes of blocking the future buildup of amyloid and preventing the disease. Drugs have also been developed that follow the other two strategies outlined above. Alzhemed and scyllo-inositol are two small molecules that bind to Aβ peptides and block molecular aggregation and fibril formation. Clinical trials have failed to demonstrate that either drug is effective in stopping disease progression in patients with mild to moderate AD. The remaining strategy (the first one listed above) is to stop production of Aβ peptides. This can be accomplished by inhibiting either β- or γ-secretase, because both enzymes are required in the pathway that cleaves the APP precursor to release the internal peptide (Figure 3). Pharmaceutical companies have had great difficulty developing a β-secretase inhibitor that is both potent and small enough to enter the brain from the bloodstream. A number of potent γ-secretase inhibitors have been developed that block the production of all Aβ peptides, both in cultured nerve cells and in transgenic AD mice. But there is a biological problem that has to be overcome with this class of inhibitors. In addition to cleaving APP,

γ-secretase activity is also required in a key signaling pathway involving a protein called Notch. Two of the most promising γ-secretase inhibitors, flurizan and semagacestat, have both failed to show any benefit in stopping AD progression. In addition to its lack of efficacy, semagacestat caused adverse side effects that were probably a result of blockade of the Notch pathway. The goal of drug designers is to develop a compound (e.g., begacestat) that blocks APP cleavage but does not interfere with cleavage of Notch. Taken collectively, the apparent failure of all of these drugs, aimed at various steps in the formation of Aβ-containing aggregates and amyloid deposition, has left the field of AD therapeutics without a clear plan for the future. Some pharmaceutical companies are continuing to develop new drugs aimed at blocking the formation of amyloid aggregates, whereas others are moving in different directions. These findings also raise a more basic question: Is the Aβ peptide even part of the underlying mechanism that leads to AD? It hasn’t been mentioned, but Aβ is not the only misfolded protein found in the brains of persons with AD. Another protein called tau, which functions as part of a nerve cell’s cytoskeleton (Section 9.3), can develop into bundles of tangled cellular filaments called neurofibrillary tangles (or NFTs) (Figure 2) that interfere with the movement of substances down the length of the nerve cell. NFTs form when the tau molecules in nerve cells become excessively phosphorylated. Mutations in the gene that encodes tau have been found to cause a rare form of dementia (called FTD), which is characterized by the formation of NFTs. Thus, NFTs have been linked to dementia, but they have been largely ignored as a causative factor in AD pathogenesis, due primarily to the fact that the transgenic AD mouse models discussed above do not develop NFTs. If one extrapolates the results of these mouse studies to humans, they suggest that NFTs are not required for the cognitive decline that occurs in patients with AD. At the same time, however, autopsies of the brains of humans who died of AD suggest that NFT burden correlates

much better with cognitive dysfunction and neuronal loss than does the concentration of amyloid plaques. Given that mutations in genes in the Aβ pathway are clearly a cause of AD, and yet it is the NFT burden that correlates with cognitive decline, it would appear that both Aβ and NFTs must be involved in AD etiology. Many researchers believe that Aβ deposition somehow leads to NFT formation, but the mechanism by which this might occur remains unknown. It is evident from this discussion that a great deal of work on AD has been based on transgenic mice carrying human AD genes. These animals have served as the primary preclinical subjects for testing AD drugs, and they have been used extensively in basic research that aims to understand the disease mechanisms responsible for the development of AD. But many questions have been raised as to how accurately these animal models mimic the disease in humans, particularly the sporadic human cases in which affected individuals lack the mutant genes that cause the animals to develop the corresponding disorder. In fact, one of the most promising new drugs at the time of this writing is one that acts on NFTs rather than β-amyloid. In this case, the drug methylthioninium chloride (brand name Rember), which dissolves NFTs, was tested on a group of more than 300 patients with mild to moderate AD in a Phase II trial. The drug was found to reduce mental decline over a period of one year by an average of 81 percent compared to patients receiving a placebo. The drug is now being tested in a larger Phase III study, but no preliminary results have yet been reported. Other compounds that inhibit one of the enzymes (GSK-3) that adds phosphate groups to the tau protein are also being investigated as AD therapeutics. Clinical trials of one GSK-3 inhibitor, valproate, have been stopped due to adverse effects. (The results of studies on these and other treatments can be examined at www.alzforum.org/dis/tre/drc)

The Emerging Field of Proteomics

With all of the attention on genome sequencing in recent years, it is easy to lose sight of the fact that genes are primarily information storage units, whereas proteins orchestrate cellular activities. Genome sequencing provides a kind of “parts list.” The human genome probably contains between 20,000 and 22,000 genes, each of which can potentially give rise to a variety of different proteins.6 To date, only a fraction of these molecules have been characterized. The entire inventory of proteins that is produced by an organism, whether human or otherwise, is known as that organism’s proteome. The term proteome is also applied to the inventory of all proteins that are present in a particular tissue, cell, or cellular organelle. Because of the sheer numbers of proteins that are currently being studied, investigators have sought to develop techniques that allow them to determine the properties or activities of a large number of proteins in a single experiment. A new term—proteomics—was coined to describe the expanding field of protein biochemistry. This term carries with it the concept that advanced technologies

and high-speed computers are used to perform large-scale studies on diverse arrays of proteins. This is the same basic approach that has proven so successful over the past decade in the study of genomes. But the study of proteomics is inherently more difficult than the study of genomics because proteins are more difficult to work with than DNA. In physical terms, one gene is pretty much the same as all other genes, whereas each protein has unique chemical properties and handling requirements. In addition, small quantities of a particular DNA segment can be expanded greatly using readily available enzymes, whereas protein quantities cannot be increased. This is particularly troublesome when one considers that many of the proteins regulating important cellular processes are present in only a handful of copies per cell. Traditionally, protein biochemists have sought to answer a number of questions about particular proteins. These include: What specific activity does the protein demonstrate in vitro, and how does this activity help a cell carry out a particular function such as cell locomotion or DNA replication? What is the protein’s three-dimensional structure? When does the protein appear in the development of the organism and in which types of cells? Where in the cell is it localized? Is the protein modified after synthesis by the addition of chemical groups (e.g., phosphates or sugars) and, if so, how does this modify its activity? How much of the protein is present, and how long does it survive before being degraded? Does the level of the protein change during physiologic activities or as

6 There are a number of ways that a single gene can give rise to more than one polypeptide. Two of the most prominent mechanisms, alternative splicing and posttranslational modification, are discussed in other sections of the text. It can also be noted that many proteins have more than one distinct function. Even myoglobin, which has long been studied as an oxygen-storage protein, has recently been shown to be involved in the conversion of nitric oxide (NO) to nitrate (NO3−).


Figure 2.47 © JOSEPH G. SUTLIFF.

the result of disease? Which other proteins in the cell does it interact with? Biologists have been attempting to answer these questions for decades but, for the most part, they’ve been doing it one protein at a time. Proteomics researchers attempt to answer similar questions on a more comprehensive scale using large-scale (or high-throughput) techniques (Figure 2.47). We have already seen how a systematic approach can be used to investigate protein–protein interactions (page 62). Let us turn our attention to the seemingly daunting task of identifying the vast array of proteins produced by a particular cell. Figure 2.48 shows portions of two gels that were used to fractionate (separate) a mixture of proteins extracted from the same part of the brain of a human (Figure 2.48a) or a chimpanzee (Figure 2.48b). The proteins have been fractionated by a procedure called two-dimensional polyacrylamide gel electrophoresis (described in detail in Section 18.7). The gels contain hundreds of different stained spots, each of which comprises a single protein, or at most a few proteins that have very similar physical properties and are therefore difficult to separate from one another. This technique, which was invented in the mid-1970s, was the first and is still one of the best ways to separate large numbers of proteins in a mixture. It is evident that virtually all of the spots on one gel have a counterpart in the other gel; those spots that are present in both gels correspond to homologous proteins shared by the two species. Certain spots are labeled by red numbers. Numbered spots correspond to proteins that exhibit differences in the two gels, either because (1) the proteins have migrated to a slightly different position (as the result of a difference in modification or amino acid sequence in the protein between the two organisms) (indicated by the double-headed arrows) or (2) the proteins are present in noticeably different amounts in the brains of the two organisms (indicated by the green arrows). In either case, this type of proteomic experiment provides an overview of the differences in protein expression in our brain compared to that of our closest living relative.

Figure 2.48 The study of proteomics often requires the separation of complex mixtures of proteins. The two electrophoretic gels shown here contain proteins extracted from the frontal cortex of humans (a) or chimpanzees (b). The numbered spots represent homologous proteins that show distinct differences in the two gels as discussed in the text. (Note: This is only a small portion of a much larger gel that contains approximately 8500 spots that represent proteins synthesized by a primate brain.) (Note: The animals in this study had died of natural causes.) (FROM W. ENARD ET AL., SCIENCE 296:342, FIGURES 3A, B. © 2002, REPRINTED WITH PERMISSION FROM AAAS.)

It is one thing to separate proteins from one another, but quite another to identify each of the separated molecules. In the past few years, two technologies, mass spectrometry and high-speed computation, have come together to make it possible to identify any or all of the proteins present in a gel such as those shown in Figure 2.48. As discussed in further detail in Section 18.7, mass spectrometry is a technique to determine the precise mass of a molecule or fragment of a molecule, which can then be used to identify that molecule. Suppose that we wanted to identify the protein present in the spot that is indicated by the red number 1.
The protein that makes up the spot in question can be removed from the gel and digested into peptides with the enzyme trypsin. When


[Figure 2.49: An unknown protein is treated with trypsin, and the resulting peptide fragments are analyzed by mass spectrometry. The spectrum plots relative intensity against m/z (axis from 600 to 2000), with labeled peaks at 678.6, 738.1, 808.3, 848.7, 893.3, 998.4, 1060.5, 1211.8, 1414.0, 1542.3, 1696.3, and 1884.7.]

Chapter 2 The Chemical Basis of Life

Figure 2.49 Identifying proteins by mass spectrometry. A protein is isolated from a source (such as one of the spots on one of the gels of Figure 2.48) and subjected to digestion by the enzyme trypsin. The peptide fragments are then introduced into a mass spectrometer where they are ionized and separated according to their mass/charge (m/z) ratio. The separated peptides appear as a pattern of peaks whose precise m/z ratio is indicated. A comparison of these ratios to those obtained by a theoretical digest of virtual proteins encoded by the genome allows researchers to identify the protein being studied. In this case, the MS spectrum is that of horse myoglobin lacking its heme group. (DATA REPRINTED FROM J. R. YATES, METHODS ENZYMOL. 271:353, 1996. COPYRIGHT 1996, WITH PERMISSION FROM ELSEVIER.)

these peptides are introduced into a mass spectrometer, they are converted into gaseous ions and separated according to their mass/charge (m/z) ratio. The results are displayed as a series of peaks of known m/z ratio, such as that shown in Figure 2.49. The pattern of peaks constitutes a highly characteristic peptide mass fingerprint of that protein. But how can a protein be identified based on its peptide mass fingerprint? The generation of peptide mass fingerprints is one element of the new proteomic technology. Another is based on advances in computer technology. Once a genome has been sequenced, the amino acid sequences of encoded proteins can be predicted. This list of “virtual proteins” can then be subjected to a theoretical trypsin digestion and the masses of the resulting virtual peptides calculated and entered into a database. Once this has been done, the actual peptide masses of a purified protein obtained by the mass spectrometer can be compared using high-speed supercomputers to the masses predicted by theoretical digests of all polypeptides encoded

by the genome. In most cases, the protein that has been isolated and subjected to mass spectrometry can be directly identified based on this type of database search. The protein labeled number 1 in the gels of Figure 2.48, for example, happens to be the enzyme aldose reductase. Mass spectrometers are not restricted to handling one purified protein at a time, but are also capable of analyzing proteins present in complex mixtures (Section 18.7). Mass spectrometry is particularly useful in revealing how the protein complement of a cell or tissue changes over time, as might occur, for example, after the secretion of a hormone within the body, or after taking a drug, or during a particular disease. Many clinical researchers believe that proteomics will play an important role in advancing the practice of medicine. It is thought that most human diseases leave telltale patterns (or biomarkers) among the thousands of proteins present in the blood or other bodily fluids. Many efforts have been made to compare the proteins present in the blood of healthy individuals with those present in the blood of persons suffering from various diseases, especially cancer. For the most part, the results of these biomarker searches have proven generally unreliable in that the findings of one research group cannot be duplicated by the efforts of other groups. The primary difficulty stems from the fact that human blood serum is such a complex solution containing thousands of proteins that range in abundance over 9 or 10 orders of magnitude. To date, the level of technical and computational sophistication has not been up to the daunting task at hand, but that may be changing. It is hoped that, one day, it will be possible to use a single blood test to reveal the existence of early-stage heart, liver, or kidney disease that can be treated before it becomes a life-threatening condition. Protein separation and mass spectrometric techniques don’t tell us anything about a protein’s function. 
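The database search behind peptide mass fingerprinting can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in practice. The two entries in `DATABASE` are made-up sequences. The cleavage rule (cut after K or R unless the next residue is P) is the usual simplification of trypsin specificity. Peptides are assumed to carry a single proton, and candidates are ranked simply by how many observed peaks fall within a tolerance of a predicted peptide mass.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # assumption: every peptide is singly protonated


def trypsin_digest(seq):
    """Theoretical digest: cleave after K or R, but not before P."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mz(peptide):
    """m/z of a singly protonated peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def count_matches(observed, seq, tol=0.5):
    """Number of observed peaks explained by the virtual digest of seq."""
    predicted = [peptide_mz(p) for p in trypsin_digest(seq)]
    return sum(any(abs(o - p) <= tol for p in predicted) for o in observed)


def identify(observed, database, tol=0.5):
    """Name of the database entry whose virtual digest matches most peaks."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))


# Hypothetical "virtual proteins" standing in for a genome-derived database.
DATABASE = {
    "protA": "ACDEKFGHIRLMNPK",
    "protB": "WWYYKTTSSRVVQQK",
}

# Simulate a measured fingerprint by digesting protA.
observed_peaks = [peptide_mz(p) for p in trypsin_digest(DATABASE["protA"])]
print(identify(observed_peaks, DATABASE))
```

Real search engines score matches statistically and allow for missed cleavages and chemical modifications. The simple peak count here is only meant to show how a fingerprint is compared with the theoretical digests.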
Researchers have been working to devise techniques that allow protein function to be determined on a large scale, rather than one protein at a time. Several new technologies have been developed to accomplish this mission; we will consider only one: the use of protein microarrays (or protein chips). A protein microarray uses a solid surface, typically a glass microscope slide, which is covered by microscopic-sized spots, each containing an individual protein sample. Protein microarrays are constructed by the application of tiny volumes of individual proteins to specific sites on the slide, generating an array of proteins such as that shown in Figure 2.50a. The 6000 or so proteins encoded by the yeast genome can fit comfortably as individual spots on a single glass slide. Once a protein microarray has been created, the proteins that it contains can be screened for various types of activities. Let’s consider the role of Ca2+ ions for a moment. Ca2+ ions play a key role in many activities, such as the formation of nerve impulses, the release of hormones into the blood, and muscle contraction. In each of these cases, Ca2+ ions act by attaching to a calcium-binding protein. As discussed in Chapter 15, calmodulin is an important calcium-binding protein, even in single-celled yeast. Figure 2.50b shows a small portion of a yeast-protein microarray that had been incubated with the protein calmodulin in the presence of Ca2+ ions. Those proteins in the array that display green fluorescence are ones that had bound calmodulin during the incubation and thus are

Figure 2.50 Global analysis of protein activities using protein chips. (a) This single microscope slide contains 5800 different yeast proteins spotted in duplicate. The proteins spotted on the slide were synthesized in genetically engineered cells. The spots display red fluorescence because they have been incubated with a fluorescent antibody that can bind to all of the proteins in the array. (b) The left image shows a small portion of the protein array depicted in part a. The right image shows the same portion of the array following incubation with calmodulin in the presence of calcium ions. The two proteins exhibiting green fluorescence are calmodulin-binding proteins. (c) Schematic illustration of the events occurring in part b where calmodulin has bound to two proteins of the microarray that have complementary binding sites. (A,B: FROM H. ZHU, ET AL., SCIENCE 293:2101, 2001, COURTESY OF MICHAEL SNYDER. © 2001, REPRINTED WITH PERMISSION FROM AAAS.)

likely to be involved in calcium signaling activity. Thirty-three new calmodulin-binding proteins were discovered in this particular protein screening experiment. These types of studies open the door to the analysis of the properties of large numbers of proteins of unknown function. Protein chips are also expected to be used one day by clinical laboratories to screen for proteins that are characteristic of particular disorders. The simplest way to determine whether a particular protein characteristic of a disease is present in a blood or urine sample, and how much of that protein is present, is to measure the protein’s interaction with a specific antibody. This is the basis, for example, of the PSA test used in routine screening of men for prostate cancer. PSA is a protein that is found in the blood of normal men, but is present at elevated levels in individuals with prostate cancer. PSA levels are determined by measuring the amount of protein in the blood that binds to anti-PSA antibodies. It is expected in the near future that biotechnology companies will be manufacturing microarrays that contain antibodies capable of binding a number of different blood proteins. The presence and/or amount of these proteins would indicate that a person may be suffering from one or another of a wide variety of diseases.

Protein Engineering Advances in molecular biology have created the opportunity to design and mass-produce novel proteins that are different from those made by living organisms. It is possible with current DNA-synthesizing techniques to create an artificial gene that can be used in the production of a protein having any desired sequence of amino acids. Polypeptides can also be synthesized from “scratch” using chemical techniques. This latter strategy allows researchers to incorporate building blocks other than the 20 amino acids that normally occur in nature. The problem with these types of engineering efforts is in knowing which of the virtually infinite variety of possible proteins one could manufacture might have some useful function.

Consider, for example, a pharmaceutical company that wanted to manufacture a therapeutic protein that would bind to the surface of the AIDS or influenza virus. Assume that computer simulation programs could predict the shape such a protein should have to bind to the viral surface. What sequence of amino acids strung together would produce such a protein? The answer requires detailed insight into the rules governing the complex relationship between a protein’s primary structure and its tertiary structure. Figure 2.51 illustrates that protein biochemists now have the knowledge that allows them to construct a protein capable of binding to the surface of another protein, in this case the hemagglutinin (HA) protein that was


Figure 2.51 The computational design of a protein that is capable of binding specifically to the surface of another protein. (a) The computationally designed protein is shown in green and its target protein (the HA protein from the H1N1 1918 influenza virus) is shown on the left in multiple colors. The predicted structure of the designed protein fits closely with that of the actual protein that was generated from the predicted sequence. (b) The actual interface of the targeted hydrophobic helix of the HA protein (gray) and the designed protein (purple). Side chains of the designed protein are seen to interact with sites on the HA helix. (A: FROM SAREL J. FLEISHMAN, ET AL., SCIENCE 332:820, 2011, IMAGE COURTESY OF DAVID BAKER; (B) FROM BRYAN S. DER AND BRIAN KUHLMAN, SCIENCE 332:801, 2011, BOTH © 2011, REPRINTED WITH PERMISSION OF AAAS.)

present in the reconstructed 1918 influenza virus (page 25). Figure 2.51a shows the HA protein in proximity to a small engineered protein (in green). This engineered protein is capable of binding to a hydrophobic patch on the surface of the HA protein with high affinity. Figure 2.51b shows a closer view of the interface between the targeted portion of the HA protein (gray) and the binding surface of the designed protein (purple). It can be seen that side chains from the designed protein interact in highly specific ways with sites on the helix of HA. You might think that designing a protein from “scratch” that is capable of catalyzing a given chemical reaction—that is, designing an enzyme—would be far beyond the capability of present-day biotechnology. Given the magnitude of the task, it came as a surprise when researchers reported in 2008 that they had successfully designed and produced artificial proteins that were capable of catalyzing two different organic reactions, neither of which was catalyzed by any known natural enzyme. One of the reactions involved breaking a carbon–carbon bond, and the other the transfer of a proton from a carbon atom. These protein architects began by choosing a catalytic mechanism that might accelerate each chosen reaction and then used computer-based calculations to construct an idealized space in which amino acid side chains were positioned (forming an active site) to accomplish the task. They then searched among known protein structures to find ones that might serve as a framework or scaffold that could hold the active site they had designed. To transform the computer models into an actual protein, they used computational techniques to generate DNA sequences that had the potential to encode such a protein. The proposed DNA molecules were synthesized and introduced into bacterial cells where the proteins were manufactured. The catalytic activities of the proteins were then tested. 
Those proteins that showed the greatest promise were then subjected to a process of test-tube evolution; the proteins were mutated to create a new generation of altered proteins, which could in turn be screened for enhanced activity. Eventually, the team obtained proteins that could accelerate the rates of reaction as much as one million times that of the uncatalyzed reaction. While this is not a rate of enhancement that would fill a natural enzyme with pride, it is a remarkable accomplishment for a team of biochemists. It suggests, in fact, that scientists will ultimately be able to construct proteins from scratch that will be capable of catalyzing virtually any chemical reaction. An alternate approach to the production of novel proteins has been to modify those that are already produced by cells. Recent advances in DNA technology have allowed investigators to isolate an individual gene from human chromosomes, to alter its information content in a precisely determined way, and to synthesize the modified protein with its altered amino acid sequence. This technique, which is called site-directed mutagenesis (Section 18.17), has many different uses, both in basic research and in applied biology. If, for example, an investigator wants to know about the role of a particular residue in the folding or function of a polypeptide, the gene can be mutated in a way that substitutes an amino acid with different charge, hydrophobic character, or hydrogen-bonding properties. The effect of the substitution on the structure and function of the modified protein can then be determined. As we will see throughout this textbook, site-directed mutagenesis has

proven invaluable in the analysis of the specific functions of minute parts of virtually every protein of interest to biologists. Site-directed mutagenesis is also used to modify the structure of clinically useful proteins to bring about various physiological effects. For example, the drug Somavert, which was approved by the FDA in 2003, is a modified version of human growth hormone (GH) containing several alterations. GH normally acts by binding to a receptor on the surface of target cells, which triggers a physiological response. Somavert competes with GH in binding to the GH receptor, but interaction between drug and receptor fails to trigger the cellular response. Somavert is prescribed for the treatment of acromegaly, a disorder that results from excess production of growth hormone.

Structure-Based Drug Design The production of new proteins is one clinical application of recent advances in molecular biology; another is the development of new drugs that act by binding to known proteins, thereby inhibiting their activity. Drug companies have access to chemical “libraries” that contain millions of different organic compounds that have been either isolated from plants or microorganisms or chemically synthesized. One way to search for potential drugs is to expose the protein being targeted to combinations of these compounds and determine which compounds, if any, happen to bind to the protein with reasonable affinity. An alternate approach, called structure-based drug design, relies upon knowledge of the structure of the protein target. If the tertiary structure of a protein has been determined, researchers can use computers to design “virtual” drug molecules whose size and shape might allow them to fit into the apparent cracks and crevices of the protein, rendering it inactive. We can illustrate both of these approaches by considering the development of the drug Gleevec, as depicted in Figure 2.52a. Introduction of Gleevec into the clinic has revolutionized the treatment of a number of relatively rare cancers, most notably that of chronic myelogenous leukemia (CML). As discussed at length in Chapters 15 and 16, a group of enzymes called tyrosine kinases are often involved in the transformation of normal cells into cancer cells. Tyrosine kinases catalyze a reaction in which a phosphate group is added to specific tyrosine residues within a target protein, an event that may activate or inhibit the target protein. The development of CML is driven almost single-handedly by the presence of an overactive tyrosine kinase called ABL.

[Figure 2.52 diagram labels: protein target; compounds to be screened; screen large chemical library; identify compound that binds to target protein and inhibits its activity; make derivative with greater binding affinity using structure-based design; verify compound binds to target protein with high affinity; preclinical testing (5a and 5b); clinical testing.]

Figure 2.52 Development of a protein-targeting drug, such as Gleevec. (a) Typical steps in drug development. In step 1, a protein (e.g., ABL) has been identified that plays a causative role in the disease. This protein is a likely target for a drug that inhibits its enzymatic activity. In step 2, the protein is incubated with thousands of compounds in a search for ones that bind with reasonable affinity and inhibit its activity. In step 3, one such compound (e.g., 2-phenylaminopyrimidine in the case of ABL) has been identified. In step 4, knowledge of the structure of the target protein is used to make derivatives of the compound (e.g., Gleevec) that have greater binding affinity and thus can be used at lower concentrations. In step 5, the compound in question is tested in preclinical experiments for toxicity and efficacy (level of effectiveness) in vivo. Preclinical experiments are typically carried out on cultured human cells (step 5a) (e.g., those from patients with CML) and laboratory animals (step 5b) (e.g., mice carrying transplants of human CML cells). If the drug appears safe and effective in animals, the drug is tested in clinical trials (step 6) as discussed on page 68. (b) The structure of Gleevec. The blue portion of the molecule indicates the structure of the compound 2-phenylaminopyrimidine that was initially identified as an ABL kinase inhibitor. (c, d) The structure of the catalytic domain of ABL in complex (c) with Gleevec (shown in yellow) and (d) with a second-generation inhibitor called Sprycel. Gleevec binds to the inactive conformation of the protein, whereas Sprycel binds to the active conformation. Both binding events block the activity that is required for the cell’s cancerous phenotype. Sprycel is effective against most cancer cells that have become resistant to the action of Gleevec. (C,D: FROM ELLEN WEISBERG ET AL., COURTESY OF JAMES D. GRIFFIN, NATURE REVS. CANCER 7:353, 2007 © 2007 REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)
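Step 4 of Figure 2.52a notes that a derivative with greater binding affinity can be used at lower concentrations. A rough way to see why is sketched below. It assumes simple one-site equilibrium binding, in which the fraction of target protein occupied is [I] / ([I] + Kd). The text does not state this model, and the two dissociation constants are hypothetical values chosen only for illustration; they are not measured values for 2-phenylaminopyrimidine or Gleevec.

```python
def fraction_bound(inhibitor_conc, kd):
    """Fraction of target occupied at equilibrium (one-site binding assumed)."""
    return inhibitor_conc / (inhibitor_conc + kd)


def conc_for_occupancy(target_fraction, kd):
    """Inhibitor concentration needed to occupy a given fraction of target."""
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must lie strictly between 0 and 1")
    return target_fraction * kd / (1 - target_fraction)


# Hypothetical dissociation constants, in micromolar (smaller = tighter binding).
WEAK_BINDER_KD_UM = 10.0
TIGHT_BINDER_KD_UM = 0.01

for label, kd in (("weak binder", WEAK_BINDER_KD_UM),
                  ("tight binder", TIGHT_BINDER_KD_UM)):
    needed = conc_for_occupancy(0.9, kd)
    print(f"{label}: about {needed:g} micromolar for 90% occupancy")
```

Under this model the concentration required for a given occupancy scales directly with Kd, so a compound that binds a thousand times more tightly is needed at roughly a thousandth of the concentration.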


During the 1980s researchers identified a compound called 2-phenylaminopyrimidine that was capable of inhibiting tyrosine kinases. This compound was discovered by randomly screening a large chemical library for compounds that exhibited this particular activity (Figure 2.52a). As is usually the case in these types of blind screening experiments, 2-phenylaminopyrimidine would not have made a very effective drug. For one reason, it was only a weak enzyme inhibitor, which meant it would have had to be used in very large quantities. 2-Phenylaminopyrimidine is described as a lead compound, a starting point from which usable drugs might be developed. Beginning with this lead molecule, compounds of greater potency and specificity were synthesized using structure-based drug design. One of the compounds to emerge from this process was Gleevec (Figure 2.52b), which was found to bind tightly to the inactive form of the ABL tyrosine kinase and prevent the enzyme from becoming activated, which is a necessary step if the cell is to become cancerous. The complementary nature of the interaction between the drug and its enzyme target is shown in Figure 2.52c. Preclinical studies demonstrated that Gleevec strongly inhibited the growth in the laboratory of cells from CML patients and that the compound showed no harmful effects in tests in animals. In the very first clinical trial of Gleevec, virtually all of the CML patients went into remission after taking once-daily doses of the compound. Gleevec has gone on to become the primary drug prescribed for treatment of CML, but this is not the end of the story. Many patients taking Gleevec experience a recurrence of their cancer as the ABL kinase becomes resistant to the drug. In such cases, the cancer can continue to be suppressed by treatment with more recently designed drugs that are capable of inhibiting Gleevec-resistant forms of the ABL kinase. 
One of these newer (second-generation) ABL kinase inhibitors is shown bound to the protein in Figure 2.52d.

Protein Adaptation and Evolution Adaptations are traits that improve the likelihood that an organism will survive in a particular environment. Proteins are biochemical adaptations that are subject to natural selection and evolutionary change in the same way as other types of characteristics, such as eyes or skeletons. This is best revealed by comparing evolutionarily related (homologous) proteins in organisms living in very different environments. For example, the proteins of halophilic (salt-loving) archaebacteria possess amino acid substitutions that allow them to maintain their solubility and function at very high cytosolic salt concentrations (up to 4 M KCl). Unlike its counterpart in other organisms, the surface of the halophilic version of the protein malate dehydrogenase, for example, is coated with aspartic and glutamic acid residues whose carboxyl groups can compete with the salt for water molecules (Figure 2.53). Homologous proteins isolated from different organisms can exhibit virtually identical shapes and folding patterns, but show strikingly divergent amino acid sequences. The greater the evolutionary distance between two organisms, the greater the difference in the amino acid sequences of their proteins. In some cases, only a few key amino acids located in a critical portion of the protein will be present in all of the organisms from which that protein has been studied. In one comparison of 226 globin sequences, only two residues were found to be absolutely

Figure 2.53 Distribution of polar, charged amino acid residues in the enzyme malate dehydrogenase from a halophilic archaebacterium. Red balls represent acidic residues, and blue balls represent basic residues. The surface of the enzyme is seen to be covered with acidic residues, which gives the protein a net charge of −156, and promotes its solubility in extremely salty environments. For comparison, a homologous protein from the dogfish, an ocean-dwelling shark, has a net charge of +16. (FROM O. DYM, M. MEVARECH, AND J. L. SUSSMAN, SCIENCE 267:1345, © 1995, REPRINTED WITH PERMISSION OF AAAS.)

conserved in all of these polypeptides; one is a histidine residue that plays a key role in the binding and release of O2. These observations indicate that the secondary and tertiary structures of proteins change much more slowly during evolution than their primary structures. This does not mean that the conformation of a protein cannot be affected in a major way by simple changes in primary structure. An example of such a change is shown in Figure 2.54. In this case, an amino acid substitution was experimentally introduced into a protein that completely altered the conformation of a small domain within a large protein molecule. The polypeptide on the left, which has a leucine at position 45, has a conformation consisting of a bundle of three α helices, whereas the polypeptide on the right, which has a tyrosine at this position, has a conformation that contains a single α helix and a four-stranded β-sheet. If a mutation having an effect of this magnitude happened to occur in nature, it might result in the formation of a protein with new functional properties and thus could be responsible for generating the ancestral form of an entirely new family of proteins. We have seen how evolution has produced different versions of proteins in different organisms, but it has also produced different versions of proteins in individual organisms. Take a particular protein with a given function, such as globin or collagen. Several different versions of each of these proteins are encoded by the human genome. In most cases, different versions of a protein, which are known as isoforms, are adapted to function in different tissues or at different stages of development. For example, humans possess six different genes encoding isoforms of the contractile protein actin. Two of these isoforms are found in smooth muscle, one in skeletal muscle, one in heart muscle, and two in virtually all other types of cells.
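The idea of an "absolutely conserved" residue, as in the globin comparison above, can be made concrete with a short sketch. It assumes the homologous sequences have already been aligned to equal length, with "-" marking gaps. The three short sequences are invented for illustration and are not real globin fragments.

```python
def conserved_positions(aligned_seqs):
    """Return (column index, residue) for every column that is identical
    in all sequences of an alignment. Gap-only columns are ignored."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for index, column in enumerate(zip(*aligned_seqs)):
        if len(set(column)) == 1 and column[0] != "-":
            conserved.append((index, column[0]))
    return conserved


# Hypothetical aligned fragments from three homologous proteins.
toy_alignment = [
    "GAHLK",
    "SAHMR",
    "TVHLK",
]
print(conserved_positions(toy_alignment))
```

With hundreds of divergent sequences, as in the comparison of 226 globins, very few columns survive this test, which is why the handful that do are taken to mark residues critical for function.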


look at the structure of DNA in Chapter 10, where it can be tied to its central role in the chemical basis of life. Each nucleotide in a strand of RNA consists of three parts (Figure 2.55a): (1) a five-carbon sugar, ribose; (2) a nitrogenous base (so called because nitrogen atoms form part of the rings of the molecule); and (3) a phosphate group. The Phosphate O–

O P O–

H

O

CH2 O

9

H

4'

H

H

H

N

N

A

4

1'

N1

3N

2

H Base

2'

OH

N H

5 6

H

3'

Figure 2.54 The dramatic effect on conformation that can result from a single amino acid substitution. In this case the switch between a leucine and a tyrosine at a critical position within this 56-amino acid polypeptide chain results in a transformation of the entire fold of the backbone of this polypeptide. This single substitution causes 85 percent of the amino acid residues to change their secondary structure. The spatial disposition of the two alternate side chains, which brings about this conformational shift, is shown in red in the model structures. The N-terminal amino acids are shown in orange and the C-terminal amino acids in blue. (FROM PATRICK A. ALEXANDER ET AL., COURTESY OF PHILIP N. BRYAN, PROC. NAT’L ACAD. SCI. U.S.A. 106:21153, 2009, FIG. 6. © 2009 NATIONAL ACADEMY OF SCIENCES.)

7

8

5'

OH

Sugar (a) Sugar phosphate backbone

O

O

P

O

–

O

CH

H

2

N

O H

N

H

H

H

O

O

P –

N

OH

H N

O

O

H OH

H

O

–

CH

2

H

O

H

O N

O –

O

N

H

O

H

P

U

H

H

O

OH

O CH

2

H O

H H

H

H

N N O

2.5 Four Types of Biological Molecules

As more and more amino acid sequences and tertiary structures of proteins were reported, it became apparent that most proteins are members of much larger families (or superfamilies) of related molecules. The genes that encode the various members of a protein family are thought to have arisen from a single ancestral gene that underwent a series of duplications during the course of evolution (see Figure 10.23). Over long periods of time, the nucleotide sequences of the various copies diverge from one another to generate proteins with related structures. Many protein families contain a remarkable variety of proteins that have evolved diverse functions. The expansion of protein families is responsible for much of the protein diversity encoded in the genomes of today’s complex plants and animals.

Nucleic Acids

Nucleic acids are macromolecules constructed out of long chains (strands) of monomers called nucleotides. Nucleic acids function primarily in the storage and transmission of genetic information, but they may also have structural or catalytic roles. There are two types of nucleic acids found in living organisms, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA serves as the genetic material of all cellular organisms, though RNA carries out that role for many viruses. In cells, information stored in DNA is used to govern cellular activities through the formation of RNA messages. In the present discussion, we will examine the basic structure of nucleic acids using single-stranded RNA as the representative molecule. We will

Figure 2.55 Nucleotides and nucleotide strands of RNA. (a) Nucleotides are the monomers from which strands of nucleic acid are constructed. A nucleotide consists of three parts: a sugar, a nitrogenous base, and a phosphate. The nucleotides of RNA contain the sugar ribose, which has a hydroxyl group bonded to the second carbon atom. In contrast, the nucleotides of DNA contain the sugar deoxyribose, which has a hydrogen atom rather than a hydroxyl group attached to the second carbon atom. Each nucleotide is polarized, having a 5′ end (corresponding to the 5′ side of the sugar) and a 3′ end. (b) Nucleotides are joined together to form a strand by covalent bonds that link the 3′ hydroxyl group of one sugar with the 5′ phosphate group of the adjoining sugar.


Figure 2.56 Nitrogenous bases in nucleic acids. Of the four standard bases found in RNA, adenine and guanine are purines, and uracil and cytosine are pyrimidines. In DNA, the pyrimidines are cytosine and thymine, which differs from uracil by a methyl group attached to the ring.
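The base classes named in this caption can be written down as a small lookup table. The Python sketch below is illustrative only: the one-letter codes (A, G, C, U, T) and the function names are assumptions introduced here, not notation taken from the text.

```python
# Base classes as named in the Figure 2.56 caption.
# The one-letter codes are a common convention, assumed here for illustration.
PURINES = {"A": "adenine", "G": "guanine"}
PYRIMIDINES = {"C": "cytosine", "U": "uracil", "T": "thymine"}


def base_class(letter):
    """Return 'purine' or 'pyrimidine' for a one-letter base code."""
    letter = letter.upper()
    if letter in PURINES:
        return "purine"
    if letter in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown base: {letter!r}")


def dna_to_rna_bases(dna):
    """Rewrite a DNA base sequence in the RNA alphabet (thymine -> uracil).

    This only swaps the one base that differs between DNA and RNA;
    it does not model how a cell copies DNA into RNA.
    """
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    for code in "AGCUT":
        print(code, base_class(code))
    print(dna_to_rna_bases("GATTACA"))
```

Under these assumptions, `base_class("U")` returns `"pyrimidine"` and `dna_to_rna_bases("GATTACA")` returns `"GAUUACA"`.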

Chapter 2 The Chemical Basis of Life

The sugar and nitrogenous base together form a nucleoside, so that the nucleotides of an RNA strand are also known as ribonucleoside monophosphates. The phosphate is linked to the 5′ carbon of the sugar, and the nitrogenous base is attached to the sugar’s 1′ carbon. During the assembly of a nucleic acid strand, the hydroxyl group attached to the 3′ carbon of the sugar of one nucleotide becomes linked by an ester bond to the phosphate group attached to the 5′ carbon of the next nucleotide in the chain. Thus the nucleotides of an RNA (or DNA) strand are connected by sugar–phosphate linkages (Figure 2.55b), which are described as 3′–5′-phosphodiester bonds because the phosphate group forms ester linkages with two oxygen atoms, one from each of the two adjoining sugars.

A strand of RNA (or DNA) contains four different types of nucleotides distinguished by their nitrogenous base. Two types of bases occur in nucleic acids: pyrimidines and purines (Figure 2.56). Pyrimidines are smaller molecules, consisting of a single ring; purines are larger, consisting of two rings. RNAs contain two different purines, adenine and guanine, and two different pyrimidines, cytosine and uracil. In DNA, uracil is replaced by thymine, a pyrimidine with an extra methyl group attached to the ring (Figure 2.56).

Although RNAs consist of a continuous single strand, they often fold back on themselves to produce molecules having extensive double-stranded segments and complex three-dimensional structures. This is illustrated by the two RNAs shown in Figure 2.57. The RNA whose secondary structure is shown in Figure 2.57a is a component of the small subunit of the bacterial ribosome (see Figure 2.58). Ribosomal RNAs are not molecules that carry genetic information; rather, they serve as structural scaffolds on which the proteins of the ribosome can be attached and as elements that recognize and bind various soluble components required for protein synthesis. One of the ribosomal RNAs of the large subunit acts as the catalyst for the reaction by which amino acids are covalently joined during protein synthesis. RNAs having a catalytic role are called RNA enzymes, or ribozymes. Figure 2.57b depicts the tertiary structure of the so-called hammerhead ribozyme, which is able to cleave its own RNA strand. In both examples shown in Figure 2.57, the double-stranded regions are held together by hydrogen bonds between the bases. This same principle is responsible for holding together the two strands of a DNA molecule.

Nucleotides are not only important as building blocks of nucleic acids; they also have important functions in their own right. Most of the energy being put to use at any given moment in any living organism is derived from the nucleotide adenosine triphosphate (ATP). The structure of ATP and its key role in cellular metabolism are discussed in the following chapter. Guanosine triphosphate (GTP) is another nucleotide of enormous importance in cellular activities. GTP binds to a variety of proteins (called G proteins) and acts as a switch to turn on their activities (see Figure 11.49 for an example).

Figure 2.57 RNAs can assume complex shapes. (a) This ribosomal RNA is an integral component of the small ribosomal subunit of a bacterium. In this two-dimensional profile, the RNA strand is seen to be folded back on itself in a highly ordered pattern so that most of the molecule is double-stranded. (b) This hammerhead ribozyme, as it is called, is a small RNA molecule from a viroid (page 26). The helical nature of the double-stranded portions of this RNA can be appreciated in this three-dimensional model of the molecule. (B: FROM WILLIAM G. SCOTT ET AL., CELL 81:993, © 1995, WITH PERMISSION FROM ELSEVIER.)
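Two ideas from this passage lend themselves to a short Python sketch: counting the phosphodiester bonds in a linear strand, and checking whether two segments of the same strand could pair when the strand folds back on itself. The pairing rule used here (A with U, G with C, between antiparallel segments) is the standard base-pairing rule, which this section does not spell out; the function names and example sequences are hypothetical.

```python
# Standard RNA base pairs (assumed here; the passage says only that
# double-stranded regions are held together by hydrogen bonds between bases).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def phosphodiester_bonds(strand):
    """A linear strand of n nucleotides is joined by n - 1 phosphodiester bonds."""
    return max(len(strand) - 1, 0)


def can_form_stem(upstream, downstream):
    """Check whether two segments of one strand could pair after fold-back.

    Both segments are written 5' to 3'. Folding the strand back makes them
    antiparallel, so the first base of `upstream` faces the last base of
    `downstream`, and so on.
    """
    if len(upstream) != len(downstream):
        return False
    return all((a, b) in PAIRS for a, b in zip(upstream, reversed(downstream)))


if __name__ == "__main__":
    hairpin = "GGGAAAUCCC"  # hypothetical sequence: stem GGG / CCC, loop AAAU
    print(phosphodiester_bonds(hairpin))
    print(can_form_stem(hairpin[:3], hairpin[-3:]))
    print(can_form_stem("GGA", "CCC"))
```

The sketch ignores everything that makes real RNA folding difficult (loop geometry, non-standard pairs, energetics); it only shows why a single strand containing complementary segments can be largely double-stranded.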

REVIEW
1. Which macromolecules are polymers? What is the basic structure of each type of monomer? How do the various monomers of each type of macromolecule vary among themselves?
2. Describe the structure of nucleotides and the manner in which these monomers are joined to form a polynucleotide strand. Why would it be overly simplistic to describe RNA as a single-stranded nucleic acid?
3. Name three polysaccharides composed of polymers of glucose. How do these macromolecules differ from one another?
4. Describe the properties of three different types of lipid molecules. What are their respective biological roles?
5. What are the major properties that distinguish different amino acids from one another? What roles do these differences play in the structure and function of proteins?
6. What are the properties of glycine, proline, and cysteine that distinguish these amino acids?
7. How are the properties of an α helix different from a β strand? How are they similar?
8. Given that proteins act as molecular machines, explain why conformational changes are so important in protein function.

2.6 The Formation of Complex Macromolecular Structures

To what degree can the lessons learned from the study of protein architecture be applied to more complex structures in the cell? Can structures, such as membranes, ribosomes, and cytoskeletal elements, which consist of different types of subunits, also assemble by themselves? How far can subcellular organization be explained simply by having the pieces fit together to form the most stable arrangement? The assembly of cellular organelles is poorly understood, but it is apparent from the following examples that different types of subunits can self-assemble to form higher-order arrangements.

The Assembly of Tobacco Mosaic Virus Particles and Ribosomal Subunits

The most convincing evidence that a particular assembly process is self-directed is the demonstration that the assembly can occur outside the cell (in vitro) under physiologic conditions when the only macromolecules present are those that make up the final structure. In 1955, Heinz Fraenkel-Conrat and Robley Williams of the University of California, Berkeley, demonstrated that TMV particles, which consist of one long RNA molecule (approximately 6600 nucleotides) wound within a helical capsule made of 2130 identical protein subunits (see Figure 1.20), were capable of self-assembly. In their experiments, they purified TMV RNA and protein separately, mixed them together under suitable conditions, and recovered mature, infective particles after a short period of incubation. Clearly the two components contain all the information necessary for particle formation.

Ribosomes, like TMV particles, are made of RNA and protein. Unlike the simpler TMV, ribosomes contain several different types of RNA and a considerable collection of different proteins. All ribosomes, regardless of their source, are composed of two subunits of different size. Although ribosomal subunits are often depicted in drawings as symmetric structures, in fact they have a highly irregular shape, as indicated in Figure 2.58. The large (or 50S) ribosomal subunit of bacteria contains two molecules of RNA and approximately 32 different proteins. The small (or 30S) ribosomal subunit of bacteria contains one molecule of RNA and 21 different proteins. The structure and function of the ribosome are discussed in detail in Section 11.8.

Figure 2.58 Reconstruction of a ribosome from the cytoplasm of a wheat germ cell. This reconstruction is based on high-resolution electron micrographs and shows the two subunits of this eukaryotic ribosome, the small (40S) subunit on the left and the large (60S) subunit on the right. The internal structure of a ribosome is discussed in Section 11.8. (FROM ADRIANA VERSCHOOR, ET AL., J. CELL BIOL. VOL. 133 (COVER #3), 1996; BY COPYRIGHT PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

One of the milestones in the study of ribosomes came in the mid-1960s, when Masayasu Nomura and his co-workers at the University of Wisconsin succeeded in reconstituting complete, fully functional 30S bacterial subunits by mixing the 21 purified proteins of the small subunit with purified small-subunit ribosomal RNA. Apparently, the components of the small subunit contain all the information necessary for the assembly of the entire particle. Analysis of the intermediates that form at different stages during reconstitution in vitro indicates that subunit assembly occurs in a sequential step-by-step manner that closely parallels the process in vivo. At least one of the proteins of the small subunit (S16) appears to function solely in ribosome assembly; deletion of this protein from the reconstitution mixture greatly slowed the assembly process but did not block the formation of fully functional ribosomes. Reconstitution of the large subunit of the bacterial ribosome was accomplished in the following decade.

It should be kept in mind that although it takes approximately 2 hours at 50°C to reconstitute the ribosome in vitro, the bacterium can assemble the same structure in a few minutes at temperatures as low as 10°C. It may be that the bacterium uses something that is not available to the investigator who begins with purified components. Assembly of the ribosome within the cell, for example, may include the participation of accessory factors that function in protein folding, such as the chaperones described in the Experimental Pathways. In fact, the formation of ribosomes within a eukaryotic cell requires the transient association of many proteins that do not end up in the final particle, as well as the removal of approximately half the nucleotides of the large ribosomal RNA precursor (Section 11.3). As a result, the components of the mature eukaryotic ribosome no longer possess the information to reconstitute themselves in vitro.
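The particle compositions quoted in this section can be collected into a few lines of Python to make the stoichiometry explicit. This is a bookkeeping sketch only: the variable and function names are hypothetical, and the numbers are the approximate values given in the text.

```python
# Values quoted in the text for tobacco mosaic virus (TMV).
TMV_RNA_NUCLEOTIDES = 6600   # "approximately 6600 nucleotides"
TMV_PROTEIN_SUBUNITS = 2130  # identical coat protein subunits

# Bacterial ribosomal subunits as described in the text.
BACTERIAL_RIBOSOME = {
    "50S": {"rna_molecules": 2, "proteins": 32},  # "approximately 32"
    "30S": {"rna_molecules": 1, "proteins": 21},
}


def nucleotides_per_subunit(nucleotides, subunits):
    """Average number of RNA nucleotides per protein subunit."""
    return nucleotides / subunits


def total_components(ribosome):
    """Sum the RNA molecules and proteins over both subunits."""
    rna = sum(s["rna_molecules"] for s in ribosome.values())
    proteins = sum(s["proteins"] for s in ribosome.values())
    return rna, proteins


if __name__ == "__main__":
    ratio = nucleotides_per_subunit(TMV_RNA_NUCLEOTIDES, TMV_PROTEIN_SUBUNITS)
    print(f"TMV: about {ratio:.1f} nucleotides per protein subunit")
    rna, proteins = total_components(BACTERIAL_RIBOSOME)
    print(f"Bacterial ribosome: {rna} RNAs, about {proteins} proteins")
```

With these figures, TMV works out to roughly three nucleotides per coat protein subunit, and the bacterial ribosome to 3 RNA molecules and about 53 proteins. The contrast between one kind of protein in TMV and dozens in the ribosome is what made in vitro reconstitution of the ribosome so much more demanding.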

REVIEW
1. What type of evidence suggests that bacterial ribosomal subunits are capable of self-assembly, but eukaryotic subunits are not?
2. What evidence would indicate that a particular ribosomal protein had a role in ribosome function but not assembly?

EXPERIMENTAL PATHWAYS


Chaperones: Helping Proteins Reach Their Proper Folded State

In 1962, F. M. Ritossa, an Italian biologist studying the development of the fruit fly Drosophila, reported a curious finding.¹ When the temperature at which fruit-fly larvae were developing was raised from the normal 25°C to 32°C, a number of new sites on the giant chromosomes of the larval cells became activated. As we will see in Chapter 10, the giant chromosomes of these insect larvae provide a visual exhibit of gene expression (see Figure 10.8). The results suggested that increased temperature induced the expression of new genes, a finding that was confirmed a decade later with the characterization of several proteins that appeared in larvae following temperature elevation.² It was soon found that this response, called the heat-shock response, was not confined to fruit flies, but can be initiated in many different cells from virtually every type of organism, from bacteria to plants and mammals. Closer examination revealed that the proteins produced during the response were found not only in heat-shocked cells, but also at lower concentration in cells under normal conditions. What is the function of these so-called heat-shock proteins (hsps)? The answer to this question was gradually revealed by a series of seemingly unrelated studies.

We saw on page 79 that some complex, multisubunit structures, such as a bacterial ribosome or a tobacco mosaic virus particle, can self-assemble from purified subunits. It was demonstrated in the 1960s that the proteins that make up bacteriophage particles (see Figure 1.21c) also possess a remarkable ability to self-assemble, but they are generally unable to form a complete, functional virus particle by themselves in vitro. Experiments on phage assembly in bacterial cells confirmed that phages require bacterial help. It was shown in 1973, for example, that a certain mutant strain of bacteria, called GroE, could not support the assembly of normal phages. Depending on the type of phage, the head or the tail of the phage particle was assembled incorrectly.³,⁴ These studies suggested that a protein encoded by the bacterial chromosome participated in the assembly of viruses, even though this host protein was not a component of the final virus particles. As it obviously did not evolve as an aid for virus assembly, the bacterial protein required for phage assembly had to play some role in the cell’s normal activities, but the precise role remained obscure. Subsequent studies revealed that the GroE site on the bacterial chromosome actually contains two separate genes, GroEL and GroES, that encode two separate proteins, GroEL and GroES. Under the electron microscope, the purified GroEL protein appeared as a cylindrical assembly consisting of two disks. Each disk was composed of seven subunits arranged symmetrically around the central axis (Figure 1).⁵,⁶

Several years later, a study on pea plants hinted at the existence of a similar assembly-promoting protein in the chloroplasts of plants.⁷ Rubisco is a large protein in chloroplasts that catalyzes the reaction in which CO₂ molecules taken up from the atmosphere are covalently linked to organic molecules during photosynthesis (Section 6.6). Rubisco comprises 16 subunits: 8 small subunits (molecular mass of 14,000 daltons) and 8 large subunits (55,000 daltons). It was found that large Rubisco subunits, synthesized inside the chloroplast, are not present in an independent state, but are associated with a huge protein assembly consisting of identical subunits of 60,000 daltons (60 kDa) molecular mass. In their paper, the researchers considered the possibility that the complex formed by the large Rubisco subunits and the 60-kDa polypeptides was an intermediate in the assembly of a complete Rubisco molecule.

A separate line of investigation on mammalian cells also revealed the existence of proteins that appeared to assist the assembly of multisubunit proteins. Like Rubisco, antibody molecules consist of a complex of two different types of subunits, smaller light chains and larger heavy chains.
Just as the large subunits of Rubisco become associated with another protein not found in the final complex, so too do the heavy chains of an antibody complex.⁸ This protein, which associates with newly synthesized heavy chains, but not with heavy chains that are already bound to light chains, was named binding protein, or BiP. BiP was subsequently found to have a molecular mass of 70,000 daltons (70 kDa).

To this point, we have been discussing two lines of investigation: one concerned with the heat-shock response and the other with proteins that promote protein assembly. These two fields came together in 1986, when it was shown that one of the proteins that figures most prominently in the heat-shock response, a protein that had been named heat-shock protein 70 (hsp70) because of its molecular mass, was identical to BiP, the protein implicated in the assembly of antibody molecules.⁹

Even before the discovery of the heat-shock response, the structure of proteins was known to be sensitive to temperature, with a small rise in temperature causing these delicate molecules to begin to unfold. Unfolding exposes hydrophobic residues that were previously buried in the protein’s core. Just as fat molecules in a bowl of soup are pushed together into droplets, so too are proteins with hydrophobic patches on their surface. Consequently, when a cell is heat shocked, soluble proteins become denatured and form aggregates. A report in 1985 demonstrated that, following temperature elevation, newly synthesized hsp70 molecules enter cell nuclei and bind to aggregates of nuclear proteins, where they act like molecular crowbars to promote disaggregation.¹⁰ Because of their role in assisting the assembly of proteins by preventing undesirable interactions, hsp70 and related molecules were named molecular chaperones.¹¹ It was soon demonstrated that the bacterial heat-shock protein GroEL and the Rubisco assembly protein in plants were homologous proteins. In fact, the two proteins share the same amino acids at nearly half of the more than 500 residues in their respective molecules.¹² The fact that the two proteins, both members of the Hsp60 chaperone family, have retained so many of the same amino acids reflects their similar and essential function in the two types of cells.

Figure 1 A model of the GroEL complex built according to data from electron microscopy and molecular-weight determination. The complex is seen to consist of two disks, each composed of seven identical subunits arranged symmetrically around a central axis. Subsequent studies showed the complex contains two internal chambers. (FROM T. HOHN ET AL., J. MOL. BIOL. 129:371, © 1979, WITH PERMISSION OF ELSEVIER.)

Figure 2 Reconstructions of GroEL based on high-resolution electron micrographs taken of specimens that had been frozen in liquid ethane and examined at −170°C. The image on the left shows the GroEL complex, and that on the right shows the GroEL complex with GroES, which appears as a dome on one end of the cylinder. It is evident that the binding of the GroES is accompanied by a marked change in conformation of the apical end of the proteins that make up the top GroEL ring (arrow), which results in a marked enlargement of the upper chamber. (FROM S. CHEN ET AL., COURTESY OF HELEN R. SAIBIL, NATURE 371:263, © 1994, REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)
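The subunit figures quoted for Rubisco earlier in this essay (8 small subunits of 14,000 daltons and 8 large subunits of 55,000 daltons) imply a total mass for the assembled enzyme, which the following Python sketch adds up. The dictionary layout and function names are hypothetical, and the result is simple arithmetic on the quoted values rather than a measured mass.

```python
# Rubisco subunit composition as quoted in the text.
RUBISCO_SUBUNITS = {
    "small": {"count": 8, "mass_da": 14_000},
    "large": {"count": 8, "mass_da": 55_000},
}


def complex_mass(subunits):
    """Total mass in daltons: the sum of count x subunit mass over all subunit types."""
    return sum(s["count"] * s["mass_da"] for s in subunits.values())


def mass_fraction(subunits, name):
    """Fraction of the total mass contributed by one subunit type."""
    part = subunits[name]["count"] * subunits[name]["mass_da"]
    return part / complex_mass(subunits)


if __name__ == "__main__":
    total = complex_mass(RUBISCO_SUBUNITS)
    print(f"Rubisco: {total:,} daltons in total")
    print(f"Large subunits: {mass_fraction(RUBISCO_SUBUNITS, 'large'):.0%} of the mass")
```

On these numbers the complete enzyme comes to 552,000 daltons, with the large subunits, the ones found associated with the 60-kDa assembly protein, accounting for about 80 percent of the mass.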
But what was that essential function? At this point it was thought that their primary function was to mediate the assembly of multisubunit complexes, such as Rubisco. This view was changed in 1989 by experiments studying molecular chaperones in mitochondria by Arthur Horwich of Yale University and F.-Ulrich Hartl, Walter Neupert, and their colleagues at the University of Munich.¹³,¹⁴ It was known that newly made mitochondrial proteins produced in the cytosol had to cross the outer mitochondrial membranes in an unfolded, extended, monomeric form. A mutant was found that altered the activity of another member of the Hsp60 chaperone family that resided inside mitochondria. In cells containing this mutant chaperone, proteins that were transported into mitochondria failed to fold into their active forms. Even proteins that consisted of a single polypeptide chain failed to fold into their native conformation. This finding changed the perception of chaperone function from a notion that they assist assembly of already-folded subunits into larger complexes, to our current understanding that they assist polypeptide chain folding within the crowded confines of the cell.

The results of these and other studies indicated the presence in cells of at least two major families of molecular chaperones: the Hsp70 chaperones, such as BiP, and the Hsp60 chaperones (which are also called chaperonins), such as Hsp60, GroEL, and the Rubisco assembly protein. We will focus on the Hsp60 chaperonins, such as GroEL, which are best understood.

As first revealed in 1979, GroEL is a huge molecular complex of 14 polypeptide subunits arranged in two stacked rings resembling a double doughnut.⁵,⁶ Fifteen years after these first electron micrographs were taken, the three-dimensional structure of the GroEL complex was determined by X-ray crystallography.¹⁵ The study revealed the presence of a central cavity within the GroEL cylinder. Subsequent studies demonstrated that this cavity was divided into two separate chambers. Each chamber was situated within the center of one of the rings of the GroEL complex and was large enough to enclose a polypeptide undergoing folding. Electron microscopic studies also provided information about the structure and function of a second protein, GroES, which acts in conjunction with GroEL. Like GroEL, GroES is a ring-like protein with seven subunits arrayed symmetrically around a central axis.
GroES, however, consists of only one ring, and its subunits are much smaller (10,000 daltons) than those of GroEL (60,000 daltons). GroES is seen as a cap or dome that fits on top of either end of a GroEL cylinder (Figure 2). The attachment of GroES to one end of GroEL causes a dramatic conformational change in the GroEL protein that markedly increases the volume of the enclosed chamber at that end of the complex.¹⁶ The importance of this conformational change has been revealed in remarkable detail by X-ray crystallographic studies in the laboratories of Arthur Horwich and Paul Sigler at Yale University.¹⁷ As shown in Figure 3, the binding of the GroES cap is accompanied by a 60° rotation of the apical (red) domain of the subunits that make up the GroEL ring at that end of the GroEL cylinder.

The attachment of GroES does more than trigger a conformational change that enlarges the GroEL chamber. Before attachment of GroES, the inner wall of the GroEL chamber has exposed hydrophobic residues that give the lining a hydrophobic character. Nonnative polypeptides also have exposed hydrophobic residues that become buried in the interior of the native polypeptide. Because hydrophobic surfaces tend to interact, the hydrophobic lining of the GroEL cavity binds to the surface of nonnative polypeptides. Binding of GroES to GroEL buries the hydrophobic residues of the GroEL wall and exposes a number of polar residues, thereby changing the character of the chamber wall. As a result of this change, a nonnative polypeptide that had been bound to the GroEL wall by hydrophobic interactions is displaced into the space within the chamber. Once freed from its attachment to the chamber wall, the polypeptide is given the opportunity to continue its folding in a protected environment. After about 15 seconds, the GroES cap dissociates from the GroEL ring, and the polypeptide is ejected from the chamber. If the polypeptide has not reached its native conformation by the time it is ejected, it can rebind to the same or another GroEL, and the process is repeated. A model depicting some of the steps thought to occur during GroEL-GroES-assisted folding is shown in Figure 4.

Figure 3 Conformational change in GroEL. (a) The model on the left shows a surface view of the two rings that make up the GroEL chaperonin. The drawing on the right shows the tertiary structure of one of the subunits of the top GroEL ring. The polypeptide chain can be seen to fold into three domains. (b) When a GroES ring (arrow) binds to the GroEL cylinder, the apical domain of each GroEL subunit of the adjacent ring undergoes a dramatic rotation of approximately 60° with the intermediate domain (shown in green) acting like a hinge. The effect of this shift in parts of the polypeptide is a marked elevation of the GroEL wall and enlargement of the enclosed chamber. (FROM Z. XU, A. L. HORWICH, AND P. B. SIGLER, NATURE 388:744, 1997. © 1997, REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS, LTD.)

Approximately 250 of the roughly 2400 proteins present in the cytosol of an E. coli cell normally interact with GroEL.¹⁸ How is it possible for a chaperone to bind so many different polypeptides? The GroEL binding site consists of a hydrophobic surface formed largely by two α helices of the apical domain that is capable of binding virtually any sequence of hydrophobic residues that might be accessible in a partially folded or misfolded polypeptide.¹⁹ A comparison of the crystal structure of the unbound GroEL molecule with that of GroEL bound to several different peptides revealed that the binding site on the apical domain of a GroEL subunit can locally adjust its positioning when bound to different partners. This finding indicates that the binding site has structural flexibility that allows it to adjust its shape to fit the shape of the particular polypeptide with which it has to interact. A number of studies have also suggested that GroEL does more than simply provide a passive chamber in which proteins can fold without outside interference.
In one study, site-directed mutagenesis was utilized to modify a key residue, Tyr71 of GroES, whose side chain hangs from the ceiling of the folding chamber.²⁰ Because of its aromatic ring, tyrosine is a modestly hydrophobic residue (Figure 2.26). When Tyr71 was replaced by a positively or negatively charged amino acid, the resulting GroEL-GroES variant exhibited an increased ability to assist the folding of one specific foreign polypeptide, the green fluorescent protein (GFP). However, substitutions for Tyr71 that improved the ability of GroES-GroEL to increase GFP folding made the chaperonin less competent to help its natural substrates fold. Thus as the chaperonin became more and more specialized to interact with GFP, it lost its general ability to assist folding of proteins having an unrelated structure. This finding suggests that individual amino acids in the wall of the folding chamber may participate somehow in the folding reaction.

Data from another study have suggested that binding of a nonnative protein to GroEL is followed by a forced unfolding of the substrate protein.²¹ FRET (fluorescence resonance energy transfer) is a technique (discussed in Section 18.1) that allows researchers to determine the distance between different parts of a protein molecule at different times during a given process. In this study, investigators found that the protein undergoing folding, in this case Rubisco, bound to the apical domain of the GroEL ring in a relatively compact state. The compact nature of the bound protein was revealed by the close proximity to one another of the FRET tags, which were attached to amino acids located at opposite ends of the Rubisco chain. Then, during the conformational change that enlarges the volume of the GroEL cavity (Figure 3), the bound Rubisco protein was forcibly unfolded, as evidenced by the increased distance between the two tagged ends of the molecule.
This study suggests that the Rubisco polypeptide is taken completely back to the unfolded state, where it is given the opportunity to refold from scratch. This action should help prevent the nonnative protein from becoming trapped permanently in a misfolded state. In other words, each individual visit to a GroEL-GroES chamber provides an all-or-none attempt to reach the native state, rather than just one stage in a series of steps in which the protein moves closer to the native state with each round of folding.

Recent reviews of molecular chaperones can be found in References 22–23. Keep in mind that molecular chaperones do not convey information for the folding process but instead prevent proteins from veering off their correct folding pathway and finding themselves in misfolded or aggregated states. Just as Anfinsen discovered decades ago, the three-dimensional structure of a protein is determined by its amino acid sequence.

Figure 4 A schematic illustration of the proposed steps that occur during the GroEL-GroES-assisted folding of a polypeptide. The GroEL is seen to consist of two chambers that have equivalent structures and functions and that alternate in activity. Each chamber is located within one of the two rings that make up the GroEL complex. The nonnative polypeptide enters one of the chambers (step 1) and binds to hydrophobic sites on the chamber wall. Binding of the GroES cap produces a conformational change in the wall of the top chamber, causing the enlargement of the chamber and release of the nonnative polypeptide from the wall into the encapsulated space (step 2). After about 15 seconds have elapsed, the GroES dissociates from the complex and the polypeptide is ejected from the chamber (step 3). If the polypeptide has achieved its native conformation, as has the molecule on the left, the folding process is complete. If, however, the polypeptide is only partially folded, or is misfolded, it will rebind the GroEL chamber for another round of folding. (Note: As indicated, the mechanism of GroEL action is driven by the binding and hydrolysis of ATP, an energy-rich molecule whose function is discussed at length in the following chapter.) (A. L. HORWICH, ET AL., PROC. NAT’L. ACAD. SCI. U.S.A. 96:11037, 1999.)

References

10.

11. 12. 13.

14. 15. 16. 17. 18. 19. 20. 21. 22. 23.

Identity with the 78 kD glucose-regulated protein and immunoglobin heavy chain binding protein. Cell 46:291–300. LEWIS, M. J. & PELHAM, H.R.B. 1985. Involvement of ATP in the nuclear and nucleolar functions of the 70kD heat-shock protein. EMBO J. 4:3137–3143. ELLIS, J. 1987. Proteins as molecular chaperones. Nature 328:378–379. HEMMINGSEN, S. M. ET AL. 1988. Homologous plant and bacterial proteins chaperone oligomeric protein assembly. Nature 333:330–334. CHENG, M. Y. ET AL. 1989. Mitochondrial heat-shock protein Hsp60 is essential for assembly of proteins imported into yeast mitochondria. Nature 337:620–625. OSTERMANN, J., ET AL. 1989. Protein folding in mitochondria requires complex formation with hsp60 and ATP hydrolysis. Nature 341: 125–130. BRAIG, K. ET AL. 1994. The crystal structure of the bacterial chaperonin GroEL at 2.8A. Nature 371:578–586. CHEN, S. ET AL. 1994. Location of a folding protein and shape changes in GroEL-GroES complexes. Nature 371:261–264. XU, Z., HORWICH, A. L., & SIGLER, P. B. 1997. The crystal structure of the asymmetric GroEL-GroES-(ADP)7 chaperonin complex. Nature 388:741–750. KERNER, M. J. ET AL. 2005. Proteome-wide analysis of chaperonin-dependent protein folding in Escherichia coli. Cell 122:209–220. CHEN, L. & SIGLER, P. 1999. The crystal structure of a GroEL/peptide complex: plasticity as a basis for substrate diversity. Cell 99:757–768. WANG, J. D. ET AL. 2002. Directed evolution of substrate-optimized GroEL/S chaperonins. Cell 111:1027–1039. LIN, Z. ET AL. 2008. GroEL stimulates protein folding through forced unfolding. Nature Struct. Mol. Biol. 15:303–311. ROTHMAN, J. E. & SCHEKMAN, R. 2011. Molecular mechanisms of protein folding in the cell. Cell 146:851–854. HARTL, F. U., ET AL., 2011. Molecular chaperones in protein folding and proteostasis. Nature 475:324–332.

Experimental Pathways

1. RITOSSA, F. 1962. A new puffing pattern induced by temperature shock and DNP in Drosophila. Experentia 18:571–573. 2. TISSIERES, A., MITCHELL, H. K., & TRACY, U. M. 1974. Protein synthesis in salivary glands of Drosophila melanogaster: Relation to chromosomal puffs. J. Mol. Biol. 84:389–398. 3. STERNBERG, N. 1973. Properties of a mutant of Escherichia coli defective in bacteriophage lambda head formation (groE). J. Mol. Biol. 76:1–23. 4. GEORGOPOULOS, C. P. ET AL. 1973. Host participation in bacteriophage lambda head assembly. J. Mol. Biol. 76:45–60. 5. HOHN, T. ET AL. 1979. Isolation and characterization of the host protein groE involved in bacteriophage lambda assembly. J. Mol. Biol. 129:359–373. 6. HENDRIX, R. W. 1979. Purification and properties of groE, a host protein involved in bacteriophage assembly. J. Mol. Biol. 129:375–392. 7. BARRACLOUGH, R. & ELLIS, R. J. 1980. Protein synthesis in chloroplasts. IX. Biochim. Biophys. Acta 608:19–31. 8. HAAS, I. G. & WABL, M. 1983. Immunoglobulin heavy chain binding protein. Nature 306:387–389. 9. MUNRO, S. & PELHAM, H.R.B. 1986. An Hsp70-like protein in the ER:


Chapter 2 The Chemical Basis of Life

| Synopsis

Covalent bonds hold atoms together to form molecules. Covalent bonds are stable partnerships formed when atoms share their outer-shell electrons, each participant gaining a filled shell. Covalent bonds can be single, double, or triple depending on the number of pairs of shared electrons. If electrons in a bond are shared unequally by the component atoms, the atom with the greater attraction for electrons (the more electronegative atom) bears a partial negative charge, whereas the other atom bears a partial positive charge. Molecules that lack polarized bonds have a nonpolar, or hydrophobic, character, which makes them insoluble in water. Molecules that have polarized bonds have a polar, or hydrophilic, character, which makes them water soluble. Polar molecules of biological importance contain atoms other than just carbon and hydrogen, usually O, N, S, or P. (p. 33)

Noncovalent bonds are formed by weak attractive forces between positively and negatively charged regions within the same molecule or between two nearby molecules. Noncovalent bonds play a key role in maintaining the structure of biological molecules and mediating their dynamic activities. Noncovalent bonds include ionic bonds, hydrogen bonds, and van der Waals forces. Ionic bonds form between fully charged positive and negative groups; hydrogen bonds form between a covalently bonded hydrogen atom (which bears a partial positive charge) and a covalently bonded nitrogen or oxygen atom (which bears a partial negative charge); van der Waals forces form between two atoms exhibiting a transient charge due to a momentary asymmetry in the distribution of electrons around the atoms. Nonpolar molecules or nonpolar portions of larger molecules tend to associate with one another in aqueous environments to form hydrophobic interactions.
Examples of these various types of noncovalent interactions include the association of DNA and proteins by ionic bonds, the association of pairs of DNA strands by hydrogen bonds, and the formation of the hydrophobic core of soluble proteins as the result of hydrophobic interactions and van der Waals forces. (p. 34)

Water has unique properties on which life depends. The covalent bonds that make up a water molecule are highly polarized. As a result, water is an excellent solvent capable of forming hydrogen bonds with virtually all polar molecules. Water is also a major determinant of the structure of biological molecules and the types of interactions in which they can engage. The pH of a solution is a measure of the concentration of hydrogen (or hydronium) ions. Most biological processes are acutely sensitive to pH because changes in hydrogen ion concentration alter the ionic state of biological molecules. Cells are protected from pH fluctuations by buffers, compounds that react with hydrogen or hydroxyl ions. (p. 37)

Carbon atoms play a pivotal role in the formation of biological molecules. Each carbon atom is able to bond with up to four other atoms, including other carbon atoms. This property allows the formation of large molecules whose backbone consists of a chain of carbon atoms. Molecules consisting solely of hydrogen and carbon are called hydrocarbons. Most of the molecules of biological importance contain functional groups that include one or more electronegative atoms, making the molecule more polar, more water soluble, and more reactive. (p. 40)

Biological molecules are members of four distinct types: carbohydrates, lipids, proteins, and nucleic acids. Carbohydrates include simple sugars and larger molecules (polysaccharides) constructed of sugar monomers. Carbohydrates function primarily as a storehouse of chemical energy and as durable building materials for biological construction.
Simple biological sugars consist of a backbone of three to seven carbon atoms, with each carbon linked to a hydroxyl group except one, which bears a carbonyl. Sugars with five or more carbon atoms self-react to form a ring-shaped molecule. Those carbon atoms along the sugar backbone that are linked to four different groups are sites of stereoisomerism, generating pairs of isomers that cannot be superimposed. The asymmetric carbon farthest from the carbonyl determines whether the sugar is D or L. Sugars are linked to one another by glycosidic bonds to form disaccharides, oligosaccharides, and polysaccharides. In animals, sugar is stored primarily as the branched polysaccharide glycogen, which provides a readily available energy source. In plants, glucose reserves are stored as starch, which is a mixture of unbranched amylose and branched amylopectin. Most of the sugars in both glycogen and starch are joined by α(1 → 4) linkages. Cellulose is a structural polysaccharide manufactured by plant cells that serves as a major component of the cell wall. Glucose monomers in cellulose are joined by β(1 → 4) linkages, which are cleaved by cellulase, an enzyme that is absent in virtually all animals. Chitin is a structural polysaccharide composed of N-acetylglucosamine monomers. (p. 42)

Lipids are a diverse array of hydrophobic molecules having widely divergent structures and functions. Fats consist of a glycerol molecule esterified to three fatty acids. Fatty acids differ in chain length and the number and position of double bonds (sites of unsaturation). Fats are very rich in chemical energy; a gram of fat contains over twice the energy content of a gram of carbohydrate. Steroids are a group of lipids containing a characteristic four-ringed hydrocarbon skeleton. Steroids include cholesterol as well as numerous hormones (e.g., testosterone, estrogen, and progesterone) that are synthesized from cholesterol. Phospholipids are phosphate-containing lipid molecules that contain both a hydrophobic end and a hydrophilic end and play a pivotal role in the structure and function of cell membranes. (p. 47)

Proteins are macromolecules of diverse function consisting of amino acids linked by peptide bonds into polypeptide chains.
Included among the diverse array of proteins are enzymes, structural materials, membrane receptors, gene regulatory factors, hormones, transport agents, and antibodies. The order in which the 20 different amino acids are incorporated into a protein is encoded in the sequence of nucleotides in DNA. All 20 amino acids share a common structural organization consisting of an α-carbon bonded to an amino group, a carboxyl group, and a side chain of varying structure. In the present scheme, the side chains are classified into four categories: those that are fully charged at physiologic pH; those that are polar, but uncharged and capable of forming hydrogen bonds; those that are nonpolar and interact by means of van der Waals forces; and three amino acids (proline, cysteine, and glycine) that possess unique properties. (p. 50)

The structure of a protein can be described at four levels of increasing complexity. Primary structure is described by the amino acid sequence of a polypeptide; secondary structure by the three-dimensional structure (conformation) of sections of the polypeptide backbone; tertiary structure by the conformation of the entire polypeptide; and quaternary structure by the arrangement of the subunits if the protein consists of more than one polypeptide chain. The α helix and β-pleated sheet are both stable, maximally hydrogen-bonded secondary structures that are common in many proteins. The tertiary structure of a protein is highly complex and unique to each individual type of protein. Most proteins have an overall globular shape in which the polypeptide is folded to form a compact molecule in which specific residues are strategically situated to allow the protein to carry out its specific function. Most proteins consist of two or more domains that maintain a structural and functional independence from one another.
Using the technique of site-directed mutagenesis, researchers can learn about the role of specific amino acid residues by making specific substitutions. In recent years, a new field of proteomics has emerged that uses advanced technologies such as mass spectrometry and high-speed computation to study various properties of proteins on a comprehensive, large scale. For example, the various interactions among thousands of proteins encoded by the fruit-fly genome have been analyzed by such large-scale techniques. (p. 54)

The information required for a polypeptide chain to achieve its native conformation is encoded in its primary structure. Some proteins fold into their final conformation by themselves; others require the assistance of nonspecific chaperones, which prevent aggregation of partially folded intermediates. (p. 63)

Nucleic acids are primarily informational molecules that consist of strands of nucleotide monomers. Each nucleotide in a strand consists of a sugar, phosphate, and nitrogenous base. The nucleotides are linked by bonds between the 3′ hydroxyl group of the sugar of one nucleotide and the 5′ phosphate group of the adjoining nucleotide. Both RNA and DNA are assembled from four different nucleotides; nucleotides are distinguished by their bases, which can be a pyrimidine (cytosine or uracil/thymine) or a purine (adenine or guanine). DNA is a double-stranded nucleic acid, and RNA is generally single stranded, though the single strand is often folded back on itself to form double-stranded sections. Information in nucleic acids is encoded in the specific sequence of nucleotides that constitute a strand. (p. 77)

| Analytic Questions

1. Sickle cell anemia results from a substitution of a valine for a glutamic acid. What do you expect the effect might be if the mutation were to have placed a leucine at that site? An aspartic acid?
2. Of the following amino acids, glycine, isoleucine, and lysine, which would you expect to be the most soluble in an acidic aqueous solution? Which the least?
3. How many structural isomers could be formed from a molecule with the formula C5H12? C4H8?
4. Glyceraldehyde is the only three-carbon aldotriose, and it can exist as two stereoisomers. What is the structure of dihydroxyacetone, the only ketotriose? How many stereoisomers does it form?
5. Bacteria are known to change the kinds of fatty acids they produce as the temperature of their environment changes. What types of changes in fatty acids would you expect as the temperature drops? Why would this be adaptive?
6. In the polypeptide backbone -C-C-N-C-C-N-C-C-NH2, identify the α-carbons.
7. Which of the following are true? Increasing the pH of a solution would (1) suppress the dissociation of a carboxylic acid, (2) increase the charge on an amino group, (3) increase the dissociation of a carboxylic acid, (4) suppress the charge on an amino group.
8. Which of the four classes of amino acids has side chains with the greatest hydrogen-bond-forming potential? Which has the greatest potential to form ionic bonds? Hydrophobic interactions?
9. If the three enzymes of the pyruvate dehydrogenase complex existed as physically separate proteins rather than as a complex, what effect might this have on the rate of reactions catalyzed by these enzymes?
10. Would you agree that neither ribonuclease nor myoglobin had quaternary structure? Why or why not?
11. How many different tripeptides are possible? How many carboxyl terminals of polypeptide chains are present in a molecule of hemoglobin?
12. You have isolated a pentapeptide composed of four glycine residues and one lysine residue that resides at the C-terminus of the peptide. Using the information provided in the legend of Figure 2.27, if the pK of the side chain of lysine is 10 and the pK of the terminal carboxyl group is 4, what is the structure of the peptide at pH 7? At pH 12?
13. The side chains of glutamic acid (pK 4.3) and arginine (pK 12.5) can form an ionic bond under certain conditions. Draw the relevant portions of the side chains and indicate whether or not an ionic bond could form at the following: (a) pH 4; (b) pH 7; (c) pH 12; (d) pH 13.
14. Would you expect a solution of high salt to be able to denature ribonuclease? Why or why not?
15. You have read in the Human Perspective that (1) mutations in the PRNP gene can make a polypeptide more likely to fold into the PrPSc conformation, thus causing CJD and (2) exposure to the PrPSc prion can lead to an infection that also causes CJD. How can you explain the occurrence of rare sporadic cases of the disease in persons who have no genetic propensity for it?
16. Persons who are born with Down syndrome have an extra (third) copy of chromosome #21 in their cells. Chromosome #21 contains the gene that encodes the APP protein. Why do you suppose that individuals with Down syndrome typically develop Alzheimer’s disease at an early age?
17. We saw on page 76 how evolution has led to the existence of protein families composed of related molecules with similar functions. A few examples are also known where proteins with very similar functions have primary and tertiary structures that show no evidence of evolutionary relationship. Subtilisin and trypsin, for example, are two protein-digesting enzymes (proteases) that show no evidence they are homologous despite the fact that they utilize the same mechanism for attacking their substrates. How can this coincidence be explained?
18. Would you agree with the statement that many different amino acid sequences can fold into the same basic tertiary structure? What data can you cite as evidence for your position?
19. In the words of one scientist: “The first question any structural biologist asks upon being told that a new [protein] structure has been solved is no longer ‘What does it look like?’; it is now ‘What does it look like?’ ” What do you suppose he meant by this statement?
20. It was noted in the Human Perspective that persons with arthritis who had taken certain NSAIDs over a long period of time exhibited a lower incidence of Alzheimer’s disease, yet double-blinded clinical trials on these same drugs did not appear to benefit patients with AD. These would appear to be contradictory findings. The first type of study is referred to as a retrospective study in that researchers look backwards from a correlation that is made at the present time, in this case a conclusion that taking NSAIDs over a period of time may prevent the development of AD. The second type of study is referred to as a prospective study in that it looks forward to future results based on an experimental plan to give some patients a drug and others a placebo. Can you give a reason why these two different approaches might lead to differing conclusions about the use of these drugs?
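Several of the questions above turn on how the charge of an ionizable group depends on pH relative to its pK. As a minimal sketch (not the book's own method), the standard Henderson-Hasselbalch relation, pH = pK + log10([A-]/[HA]), gives the fraction of a group that is protonated. The function name `fraction_protonated` is hypothetical, and treating each group independently of its neighbors is a simplifying assumption.

```python
def fraction_protonated(ph: float, pk: float) -> float:
    """Fraction of an ionizable group in its protonated form.

    Rearranged from pH = pK + log10([A-]/[HA]):
    [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pK)).
    Assumes the group titrates independently of any neighboring groups.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pk))


# pK values quoted in the questions above.
groups = {
    "lysine side chain": 10.0,
    "terminal carboxyl": 4.0,
    "glutamic acid side chain": 4.3,
    "arginine side chain": 12.5,
}

if __name__ == "__main__":
    for ph in (4.0, 7.0, 12.0, 13.0):
        for name, pk in groups.items():
            frac = fraction_protonated(ph, pk)
            print(f"pH {ph:4.1f}  {name:<26} {frac:.3f} protonated")
```

When pH equals pK the group is half protonated; two pH units on either side of the pK, it is roughly 99 percent in one form, which is why a pK well above or below the solution pH can be read as "fully charged" or "uncharged" for the purposes of these questions.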

3 Bioenergetics, Enzymes, and Metabolism

3.1 Bioenergetics
3.2 Enzymes as Biological Catalysts
3.3 Metabolism
THE HUMAN PERSPECTIVE: The Growing Problem of Antibiotic Resistance

The interrelationship between structure and function is evident at all levels of biological organization from the molecular to the organismal. We saw in the last chapter that proteins have an intricate three-dimensional structure that depends on particular amino acid residues being present in precisely the correct place. In this chapter, we will look more closely at one large group of proteins, the enzymes, and see how their complex architecture endows them with the capability to vastly increase the rate of biological reactions. To understand how enzymes are able to accomplish such feats, it is necessary to consider the flow of energy during a chemical reaction, which brings us to the subject of thermodynamics. A brief survey of the principles of thermodynamics also helps explain many of the cellular processes that will be discussed in this and following chapters, including the movement of ions across membranes, the synthesis of macromolecules, and the assembly of cytoskeletal networks. As we will see, the thermodynamic analysis of a particular system can reveal whether or not the events can occur spontaneously and, if not, provide a measure of the energy a cell must expend for the process to be accomplished. In the final section of this chapter, we will see how individual chemical reactions are linked together to form metabolic pathways and how the flow of energy and raw materials through certain pathways can be controlled.

A model showing the surface of the enzyme Δ5-3-ketosteroid isomerase with a substrate molecule (green) in the active site. The electrostatic character of the surface is indicated by color (red, acidic; blue, basic). (FROM ZHENG RONG WU ET AL., SCIENCE 276:417, 1997, COURTESY OF MICHAEL F. SUMMERS, UNIVERSITY OF MARYLAND, BALTIMORE COUNTY; © 1997 REPRINTED WITH PERMISSION FROM AAAS.)


3.1 | Bioenergetics

A living cell bustles with activity. Macromolecules of all types are assembled from raw materials, waste products are produced and excreted, genetic instructions flow from the nucleus to the cytoplasm, vesicles are moved along the secretory pathway, ions are pumped across cell membranes, and so forth. To maintain such a high level of activity, a cell must acquire and expend energy. The study of the various types of energy transformations that occur in living organisms is referred to as bioenergetics.

The Laws of Thermodynamics and the Concept of Entropy

Energy is defined as the capacity to do work, that is, the capacity to change or move something. Thermodynamics is the study of the changes in energy that accompany events in the universe. In the following pages, we will focus on a set of concepts that allow us to predict the direction that events will take and whether or not an input of energy is required to cause the event to happen. However, thermodynamic measurements provide no help in determining how rapidly a specific process will occur or the mechanism used by the cell to carry out the process.

The First Law of Thermodynamics The first law of thermodynamics is the law of conservation of energy. It states that energy can neither be created nor destroyed. Energy can, however, be converted (transduced) from one form to another. The transduction of electric energy to mechanical energy occurs when we plug in a clock (Figure 3.1a), and chemical energy is converted to thermal energy when fuel is burned in an oil heater. Cells are also capable of energy transduction. As discussed in later chapters, the chemical energy stored in certain biological molecules, such as ATP, is converted to mechanical energy when organelles are moved from place to place in a cell, to electrical energy when ions flow across a membrane, or to thermal energy when heat is released during muscle contraction (Figure 3.1b). The most important energy transduction in the biological world is the conversion of sunlight into chemical energy, the process of photosynthesis, which provides the fuel that directly or indirectly powers the activities of nearly all forms of life.1 A number of animals, including fireflies and luminous fish, are able to convert chemical energy back into light (Figure 3.1c). Regardless of the transduction process, however, the total amount of energy in the universe remains constant.

To discuss energy transformations involving matter, we need to divide the universe into two parts: the system under study and the remainder of the universe, which we will refer to as the surroundings. A system can be defined in various ways: it may be a certain space in the universe or a certain amount of matter. For example, the system may be a living cell. The changes in a system’s energy that occur during an event are manifested in two ways: as a change in the heat content of the system and in the performance of work. Even though the system may lose or gain energy, the first law of thermodynamics indicates that the loss or gain must be balanced by a corresponding gain or loss in the surroundings, so that the amount in the universe as a whole remains constant. The energy of the system is termed the internal energy (E), and its change during a transformation is ΔE (delta E). One way to describe the first law of thermodynamics is that ΔE = Q − W, where Q is the heat energy and W is the work energy. Depending on the process, the internal energy of the system at the end can be greater than, equal to, or less than its internal energy at the start.

Figure 3.1 Examples of energy transduction. (a) Conversion of electrical energy to mechanical energy, (b) conversion of chemical energy to mechanical and thermal energy, (c) conversion of chemical energy to light energy.

1 Several communities of organisms are known that are independent of photosynthesis. These include communities that reside in the hydrothermal vents at the bottom of the ocean floor that depend on energy obtained by bacterial chemosynthesis.


Figure 3.2 A change in a system’s internal energy. In this example, the system will be defined as a particular leaf of a plant. (a) During the day, sunlight is absorbed by photosynthetic pigments in the leaf’s chloroplasts and used to convert CO2 into carbohydrates, such as the glucose molecule shown in the drawing (which is subsequently incorporated into sucrose or starch). As the cell absorbs light, its internal energy increases; the energy present in the remainder of the universe has to decrease. (b) At night, the energy relationship between the cell and its surroundings is reversed as the carbohydrates produced during the day are oxidized to CO2 in the mitochondria and the energy is used to run the cell’s nocturnal activities.
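The first-law bookkeeping, ΔE = Q − W, can be illustrated with a minimal sketch. The text does not spell out a sign convention, so the one used here is an assumption: Q is counted as positive when heat flows into the system, and W as positive when the system does work on its surroundings. The function names and the numbers are hypothetical and carry arbitrary energy units.

```python
def internal_energy_change(heat_absorbed: float, work_done_by_system: float) -> float:
    """Return the change in internal energy, delta_E = Q - W.

    Assumed convention: heat_absorbed (Q) > 0 when heat enters the system,
    work_done_by_system (W) > 0 when the system does work on the surroundings.
    """
    return heat_absorbed - work_done_by_system


def surroundings_energy_change(delta_e_system: float) -> float:
    """Energy is conserved, so the surroundings change by the opposite amount."""
    return -delta_e_system


if __name__ == "__main__":
    # A system that takes in 100 units of heat and performs 40 units of work.
    delta_e = internal_energy_change(heat_absorbed=100.0, work_done_by_system=40.0)
    print(delta_e)                              # gain by the system
    print(surroundings_energy_change(delta_e))  # matching loss by the surroundings
```

Whatever values are chosen, the two results sum to zero, which is the statement that the total energy of the universe remains constant.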

[Figure 3.2, panel (a): CO2 is converted to glucose in the leaf; ΔE > 0.]

The Second Law of Thermodynamics The second law of thermodynamics expresses the concept that events in the universe have direction; they tend to proceed “downhill” from a state of higher energy to a state of lower energy. Thus, in any energy transformation, there is a decreasing availability of energy for doing additional work. Rocks fall off cliffs to the ground below, and once at the bottom, their ability to do additional work is reduced; it is very unlikely that they will lift themselves back to the top of the cliff. Similarly, opposite charges normally move together, not apart, and heat flows from a warmer to a cooler body, not the reverse. Such events are said to be spontaneous, a term that indicates they are thermodynamically favorable and can occur without the input of external energy. The concept of the second law of thermodynamics was formulated originally for heat engines, and the law carried with it the idea that it is thermodynamically impossible to construct a perpetual-motion machine. In other words, it is impossible for a machine to be 100 percent efficient, which would be required if it were to continue functioning without the input of external energy. Some of the energy is inevitably lost as the machine carries out its activity. A similar relationship holds true for living organisms. For example, when a giraffe browses on the leaves of a tree or a lion preys on the

[Figure 1 labels: CGG, fragile X syndrome (repeat ranges garbled in extraction); gene regions shown: coding exon, intron; repeat numbers 6–35 and >35; GAA, Friedreich's ataxia: 7–22, 23–200, >200.]

mine. Thus the normal huntingtin polypeptide contains a stretch of 6 to 35 glutamine residues—a polyglutamine tract—as part of its primary structure. We think of polypeptides as having a highly defined primary structure, and most of them do, but huntingtin is normally polymorphic with respect to the length of its polyglutamine tract. The protein appears to function normally as long as the tract length remains below approximately 35 glutamine residues. But if this number is exceeded, the protein takes on new properties, and the person is predisposed to developing HD. HD exhibits a number of unusual characteristics. Unlike most inherited diseases, HD is a dominant genetic disorder, which means that a person with the mutant allele will develop the disease regardless of whether or not he or she has a normal HD allele. In fact, persons who are homozygous for the HD allele are no more seriously affected than heterozygotes. This observation indicates that the mutant huntingtin polypeptide causes the disease, not because it fails to carry out a particular function, but because it acquires some type of toxic property, which is referred to as a gain-of-function mutation. This interpretation is supported by studies with mice. Mice that are engineered to carry the mutant human HD allele (in addition to their own normal alleles) develop a neurodegenerative disease resembling that found in humans. The presence of the one abnormal allele is sufficient to cause disease. Another unusual characteristic of HD and the other CAG disorders is a phenomenon called genetic anticipation, which means that, as the disease is passed from generation to generation, its severity increases and/or it strikes at an increasingly earlier age. This was once a puzzling feature of HD but is now readily explained by the fact that the number of CAG repeats in a mutant allele (and the resulting consequences) often increases dramatically from one generation to the next. 

Figure 1 Trinucleotide repeat sequences and human disease. The top line shows a generalized gene that is transcribed into a messenger RNA with several distinct portions, including a 5′ noncoding portion called the 5′ UTR (5′ untranslated region), a coding exon that carries the information for the amino acid sequence of the polypeptide, and a 3′ noncoding portion (the 3′ UTR). The introns in the DNA (see Figure 11.29) are not represented in the mature messenger RNA. The general location of the trinucleotide responsible for each of four different diseases (fragile X syndrome, Friedreich’s ataxia, Huntington’s disease, and myotonic dystrophy) is indicated by the location of each pyramid. The number of repeats responsible for the normal (red), carrier (orange), and disease (yellow) conditions for each disease-causing gene is indicated. Genes responsible for Type I diseases, such as Huntington’s, do not exhibit the intermediate “carrier” state in which an individual possesses an unstable allele but is not affected. (REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD: J-L MANDEL, NATURE 386:768, 1997; COPYRIGHT 1997.)
[Figure 1 labels: Type I diseases, CAG/polyglutamine, e.g., Huntington's disease; intron; 3′ UTR; CTG, myotonic dystrophy: 5–40, 45–200, 200 to >2000.]

The molecular basis of HD remains unclear, but there is no shortage of theories as to why an expanded glutamine tract may be toxic to brain cells. One feature appears indisputable: when the polyglutamine tract of huntingtin exceeds about 35 residues, the protein (or a fragment cleaved from it) undergoes abnormal folding to produce a misfolded molecule that (1) binds to other mutant huntingtin molecules to form insoluble aggregates, not unlike those seen in the brains of Alzheimer’s victims (page 67), and (2) binds to a number of unrelated proteins that do not interact with normal, wild-type huntingtin molecules. Among the proteins bound by mutant huntingtin are several transcription factors, which are proteins involved in the regulation of gene expression. Several of the most important transcription factors present in cells, including TBP (see Figure 11.19) and CBP (see Figure 12.50), contain polyglutamine stretches themselves, which makes them particularly susceptible to aggregation by proteins with mutant, expanded polyglutamine tracts. In fact, the protein aggregates present in degenerating neurons of HD patients contain both of these transcription factors. These results suggest that mutant huntingtin sequesters transcription factors, disrupting the transcription of genes that are required for the health and survival of the affected neurons. This hypothesis has received support from a study in which mice were genetically engineered so that their brain cells lacked the ability to produce certain key transcription factors. These mice exhibited the same type of neurodegeneration as is seen in animals carrying a mutant HD gene. Other basic neuronal processes, including axonal transport, mitochondrial permeability and fission, cholesterol synthesis, and protein degradation, are also disrupted by the HD mutation and represent possible causes of nerve cell death. It should be noted that a number of HD researchers hold an alternate opinion about the cause of cell death. They argue that it is not the protein aggregates that are toxic, but the soluble mutant protein (or fragments of the mutant protein) itself. In fact, proponents of this view argue that the mutant protein aggregates protect the cell by sequestering the harmful molecules.
It is important for practical reasons to distinguish between these possibilities because a number of proposed therapies are aimed at blocking formation of the aggregates, which could actually prove more harmful to a patient.
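The repeat-length argument in this discussion can be made concrete with a short sketch that counts the longest uninterrupted run of a trinucleotide in a DNA string and compares it with the approximate 35-repeat limit quoted for huntingtin. This is an illustration only, not a description of any actual screening procedure: the function names, the toy sequences, and the use of a hard cutoff (the text says "approximately 35") are all assumptions.

```python
import re


def longest_repeat_run(dna: str, unit: str = "CAG") -> int:
    """Return the largest number of consecutive copies of `unit` found in `dna`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(dna.upper())]
    return max(runs, default=0)


def exceeds_threshold(dna: str, unit: str = "CAG", threshold: int = 35) -> bool:
    """True if the longest run of `unit` is longer than `threshold` copies.

    The default of 35 follows the approximate limit given in the text for
    the huntingtin polyglutamine tract; it is not a precise boundary.
    """
    return longest_repeat_run(dna, unit) > threshold


if __name__ == "__main__":
    # Toy sequences, invented for illustration.
    typical = "ATG" + "CAG" * 20 + "TAA"
    expanded = "ATG" + "CAG" * 40 + "TAA"
    print(longest_repeat_run(typical), exceeds_threshold(typical))
    print(longest_repeat_run(expanded), exceeds_threshold(expanded))
```

The same helper can be pointed at a different repeat unit by changing `unit` and `threshold`, for example `unit="CGG"` for a noncoding repeat.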

Figure 10.20 Chromosomal localization of a nonrepeated DNA sequence. These mitotic chromosomes were prepared from a dividing mouse cell and incubated with a purified preparation of biotin-labeled DNA encoding one of the nuclear lamin proteins (lamin B2), which is encoded by a nonrepeated gene. The locations of the bound, labeled DNA appear as bright dots. The lamin gene is present on the homologues of chromosome 10. Each chromosome contains two copies of the gene because the DNA had been replicated prior to the cells entering mitosis. (FROM MONIKA ZEWE ET AL., COURTESY OF WERNER FRANKE, EUR. J. CELL BIOL. 56:349, 1991, WITH PERMISSION FROM ELSEVIER.)

10.4 The Structure of the Genome

Nonrepeated DNA Sequences As initially predicted by Mendel, classical studies on the inheritance patterns of visible traits led geneticists to conclude that each gene was present in one copy per single (haploid) set of chromosomes. When denatured eukaryotic DNA is allowed to reanneal, a significant fraction of the fragments are very slow to find partners, so slow in fact that they are presumed to be present in a single copy per genome. This fraction comprises the nonrepeated (or single-copy) DNA sequences, which includes the genes that exhibit Mendelian patterns of inheritance. Because they are present in a single copy in the genome, nonrepeated sequences localize to a particular site on a particular chromosome (Figure 10.20). Considering the variety of different sequences present, the nonrepeated fraction contains by far the greatest amount of genetic information. Included within the nonrepeated fraction are the DNA sequences that code for virtually all proteins other than histones. Even though these sequences are not present in multiple copies, genes that code for polypeptides are usually members of a family of related genes. This is true for the globins, actins, myosins, collagens, tubulins, integrins, and most other proteins in a eukaryotic cell. Each member of a multigene family is encoded by a different but related sequence. We will look into the origin of these multigene families in the following section. Now that the human genome has been sequenced and analyzed, we finally have a relatively accurate measure of the

Type II trinucleotide repeat diseases differ from the Type I diseases in a number of ways. Type II diseases (1) arise from the expansion of a variety of trinucleotides, not only CAG, (2) the trinucleotides involved are present in a part of the gene that does not encode amino acids (Figure 1), (3) the trinucleotides are subject to massive expansion into thousands of repeats, and (4) the diseases affect numerous parts of the body, not only the brain. The best studied Type II disease is fragile X syndrome, so named because the mutant X chromosome is especially susceptible to damage. Fragile X syndrome is characterized by mental retardation as well as a number of physical abnormalities. The disease is caused by a dynamic mutation in a gene called FMR1 that encodes an RNA-binding protein that regulates the translation of certain mRNAs involved in neuronal development and/or synaptic function. A normal allele of this gene contains anywhere from about 5 to 55 copies of a specific trinucleotide (CGG) that is repeated in a part of the gene that corresponds to the 5⬘ noncoding portion of the messenger RNA (Figure 1). However, once the number of copies rises above about 60, the locus becomes very unstable and the copy number tends to increase rapidly into the thousands. Females with an FMR1 gene containing 60 to 200 copies of the triplet generally exhibit a normal phenotype but are carriers for the transmission of a highly unstable chromosome to their offspring. If the repeat number in the offspring rises above about 200, the individual is almost always mentally retarded. Unlike an abnormal HD allele, which causes disease as the result of a gain of function, an abnormal FMR1 allele causes disease as the result of a loss of function; FMR1 alleles containing an expanded CGG number are selectively inactivated so that the gene is not transcribed or translated. 
Although there is no effective treatment for any of the diseases caused by trinucleotide expansion, the risk of transmitting or possessing a mutant allele can be assessed through genetic screening.
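The copy-number ranges quoted above for FMR1 can be written as a small lookup. The following Python sketch is only an illustration: the function name and the category labels are invented here, the cut-offs are the approximate figures given in the text ("about 5 to 55", "60 to 200", "above about 200"), and copy numbers that the text does not cover are reported as such rather than guessed.

```python
def classify_fmr1_allele(cgg_copies: int) -> str:
    """Place an FMR1 allele in one of the ranges described in the text.

    The thresholds are the approximate values quoted in the passage,
    not clinical cut-offs; the labels are hypothetical names.
    """
    if cgg_copies < 0:
        raise ValueError("copy number cannot be negative")
    if cgg_copies > 200:
        return "expanded: gene inactivated (loss of function)"
    if cgg_copies >= 60:
        return "unstable: carrier, usually normal phenotype"
    if 5 <= cgg_copies <= 55:
        return "normal"
    return "range not described in the text"


for copies in (30, 120, 1500):
    print(copies, "->", classify_fmr1_allele(copies))
```

The fall-through return makes the gaps in the quoted ranges (for example, 56 to 59 copies) explicit instead of silently assigning them to a neighboring category.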


DNA sequences that are responsible for encoding the amino acid sequences of our proteins, and it is remarkably small. If you had suggested to a geneticist in 1960 that less than 1.5 percent of the human genome encodes the amino acids of our proteins, he or she would have considered the suggestion ridiculous. Yet that is the reality that has emerged from the study of genome sequences. In the following section, we will try to better understand how the remaining 98+ percent of DNA sequences might have evolved.

REVIEW 1. What is a genome? How does the complexity of bacterial genomes differ from that of eukaryotic genomes? 2. What is meant by the term DNA denaturation? How does denaturation depend on the GC content of the DNA? How does this variable affect the Tm? 3. What is a microsatellite DNA sequence? What role do these sequences play in human disease? 4. Which fraction of the genome contains the most information? Why is this true?

10.5 | The Stability of the Genome

Because DNA is the genetic material, we tend to think of it as a conservative molecule whose information content changes slowly over long periods of evolutionary time. In fact, the sequence organization of the genome is capable of rapid change, not only from one generation to the next, but within the lifetime of a given individual.

Chapter 10 The Nature of the Gene and the Genome

Whole-Genome Duplication (Polyploidization)

As discussed in the first sections of this chapter, peas and fruit flies have pairs of homologous chromosomes in each of their cells. These cells are said to have a diploid number of chromosomes. If a person were to compare the number of chromosomes present in the cells of closely related organisms, especially higher plants, they would find that some species have a much greater number of chromosomes than a close relative. Among animals, the widely studied amphibian Xenopus laevis, for example, has twice the number of chromosomes as its cousin X. tropicalis. These types of discrepancies can be explained by a process known as polyploidization, or whole-genome duplication. Polyploidization is an event in which offspring are produced that have twice the number of chromosomes in each cell as their diploid parents; the offspring have four homologues of each chromosome rather than two. Polyploidization is thought to occur in either of two ways: two related species mate to form a hybrid organism that contains the combined chromosomes from both parents, or, alternatively, a single-celled embryo undergoes chromosome duplication but the duplicates, rather than being separated into separate cells, are retained in a single cell that develops into a viable embryo. The first mechanism occurs most often in

plants, and the second most often in animals. Polyploidization is particularly common in flowering plants, including numerous crop species (e.g., wheat, bananas, and coffee) depicted in Figure 10.21. When polyploidization occurs in a plant lineage, the chromosome number suddenly doubles and, in most cases, tends to return back toward the original diploid number over an ensuing period of evolution. As a result, different modern plant species are caught at various stages in the evolutionary process of winnowing their gene number. This is the reason that the genomes of different plants tend to have a much greater variation in numbers of genes than that exhibited by different animals (see Figure 10.27). A “sudden” doubling of the number of chromosomes is a dramatic event, one that gives an organism remarkable evolutionary potential—assuming it can survive the increased number of chromosomes and reproduce. Depending on the circumstances, polyploidization may result in the production of a new species that has a great deal of “extra” genetic information. Several different fates can befall extra copies of a gene; they can be lost by deletion, rendered inactive by deleterious mutations, or, most importantly, they can evolve into new genes that possess new functions. Viewed in this way, extra genetic information is the raw material for evolutionary diversification. In 1971, Susumu Ohno of the City of Hope Cancer Center in Los Angeles put forward the “2R” hypothesis, in which he proposed that the evolution of vertebrates from a much simpler invertebrate ancestor was made possible by two separate rounds of whole-genome duplication during an early evolutionary period. Ohno suggested that the thousands of extra genes that would be generated by genome duplication could be molded by natural selection into new genes that were required to encode the more complex vertebrate body. 
Over the past three and a half decades, Ohno’s proposal has been hotly debated as geneticists have tried to find evidence that either supports or refutes the notion.

Figure 10.21 A sampling of agricultural crops that are polyploid. Pictured are oil from oilseed rape, bread from bread wheat, rope from sisal, coffee beans, banana, cotton, potatoes, and maize. (FROM A. R. LEITCH AND I. J. LEITCH, SCIENCE 320:481, 2008; © 2008, REPRINTED WITH PERMISSION FROM AAAS.)


has about a 1 percent chance of being duplicated every million years. Duplication of a gene can probably occur by several different mechanisms but is most often thought to be produced by a process of unequal crossing over, as depicted in Figure 10.22. Unequal crossing over occurs when a pair of homologous chromosomes comes together during meiosis in such a way that they are not perfectly aligned. As a result of misalignment, genetic exchange between the homologues causes one chromosome to acquire an extra segment of DNA (a duplication) and the other chromosome to lose a DNA segment (a deletion). If the duplication of a particular sequence is repeated in subsequent generations, a cluster of tandemly repeated segments is generated at a localized site within that chromosome (see Figure 10.28). The vast majority of gene duplicates are either lost during evolution through deletion or rendered nonfunctional by unfavorable mutation. However, in a small percentage of cases, the “extra” copy accumulates favorable mutations and acquires a new function. More often, both copies of the gene undergo mutation so that each evolves a more specialized function than that performed by the original gene. In either case, the two genes will have closely related sequences and encode similar polypeptides, which is to say that they encode different isoforms of a particular protein, such as α- and β-tubulin (page 330). Subsequent duplications of one of the genes can lead to the formation of additional isoforms (e.g., γ-tubulin), and so forth. It becomes apparent from this example that successive gene duplication can generate families of genes that encode polypeptides with related amino acid sequences. The production of a multigene family is illustrated by the evolution of the globin genes.
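Unequal crossing over, as described above, can be mimicked with lists of gene labels. The Python sketch below is a deliberately simplified model and not a description of the molecular mechanism: the function name `crossover`, the gene labels, and the restriction of breakpoints to the boundaries between genes are all assumptions made for illustration.

```python
def crossover(hom_a, hom_b, break_a, break_b):
    """Exchange the segments of two homologues distal to one breakpoint each.

    Each homologue is a list of gene labels. Equal crossing over uses the
    same breakpoint on both (break_a == break_b); a misaligned pairing is
    modeled by letting the two breakpoints differ.
    """
    recombinant_1 = hom_a[:break_a] + hom_b[break_b:]
    recombinant_2 = hom_b[:break_b] + hom_a[break_a:]
    return recombinant_1, recombinant_2


normal = ["gene1", "gene2"]

# Gene 2 on one homologue pairs with gene 1 on the other, so the break
# falls after position 2 on the first and after position 1 on the second.
duplicated, deleted = crossover(normal, normal, 2, 1)
print(duplicated)  # ['gene1', 'gene2', 'gene2']
print(deleted)     # ['gene1']

# A further unequal exchange in a later generation extends the array.
longer, _ = crossover(duplicated, normal, 3, 1)
print(longer)      # ['gene1', 'gene2', 'gene2', 'gene2']
```

One product gains a copy and the other loses one, and repeating the event lengthens the tandem array, which is the pattern the text attributes to successive rounds of unequal crossing over.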

The problem facing genome analysts is the vast stretch of time, hundreds of millions of years, that has passed since the origin of our earliest vertebrate ancestors. Just as a river or sea slowly wears away the face of the Earth, chromosomal rearrangement and mutation slowly wear away the face of an ancestral genome. Even with the complete sequence of a number of invertebrate and vertebrate genomes in hand, it has still proven a formidable challenge to identify the origin of many of our genes. The strongest evidence for the 2R hypothesis comes from analysis of the amphioxus genome. Amphioxus lacks a backbone, which makes it an invertebrate, but it has a number of features (e.g., a notochord, dorsal tubular nerve cord, and segmental body musculature) that clearly identify it as a member of the phylum Chordata, to which vertebrates belong. The lineages leading to modern vertebrates and amphioxus are thought to have separated about 550 million years ago, yet the two groups share a remarkably similar collection of genes. However, when researchers looked more closely at certain groups of genes, the genomes of vertebrates typically contained four times the number of such genes when compared to the homologous sequences in the amphioxus genome. This finding provides strong support for Ohno’s hypothesis of two rounds of whole-genome duplication in the ancestral vertebrate lineage.
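The link between "two rounds" and "four times the number" is simple doubling arithmetic, sketched here in Python. The function name is hypothetical, and the calculation assumes that no duplicate is lost after each round, so it gives an upper bound rather than a prediction for any particular gene.

```python
def copies_after_duplications(rounds: int, ancestral_copies: int = 1) -> int:
    """Upper bound on gene copies after `rounds` whole-genome duplications,
    assuming every duplicate is retained."""
    if rounds < 0:
        raise ValueError("rounds cannot be negative")
    return ancestral_copies * 2 ** rounds


for rounds in range(3):
    print(rounds, "round(s):", copies_after_duplications(rounds), "copies")
```

Two rounds give four copies per ancestral gene, which matches the fourfold ratio reported for certain gene groups; genes that lost duplicates along the way would show a smaller ratio.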

Duplication and Modification of DNA Sequences

Polyploidization is an extreme case of genome duplication and occurs only rarely during evolution. In contrast, gene duplication, which refers to the duplication of a small portion of a single chromosome, happens with surprisingly high frequency, and its occurrence is readily documented by genome analysis (footnote 3). According to one estimate, each gene in the genome

Evolution of Globin Genes

Hemoglobin is a tetramer composed of four globin polypeptides (see Figure 2.38b). Examination of globin genes, whether from a mammal or a fish, reveals a characteristic organization. Each of these genes is constructed of three exons and two introns. Exons are parts of genes that code for amino acids in the encoded polypeptide,

Footnote 3: Actually, three categories of duplication can be distinguished: whole-genome, gene, and segmental duplication. The last category, which refers to the duplication of a large block of chromosomal material (from a few kilobases to hundreds of kilobases in length), is not discussed here but has a significant impact on genome evolution. Approximately 5 percent of the present human genome consists of segmental duplications that have arisen during the past 35 million years.

Figure 10.22 Unequal crossing over between duplicated genes provides a mechanism for generating changes in gene number. (a) The initial state shown has two related genes (1 and 2). In a diploid individual, gene 1 on one homologue can align with gene 2 on the other homologue during meiosis. If a crossover occurs during this misalignment, half the gametes will be missing gene 2 and half will have an extra gene 2. (b) As unequal crossing over continues to occur during meiotic divisions in subsequent generations, a tandemly repeated array of DNA sequences will gradually evolve.

whereas introns do not; they are noncoding intervening sequences. The subject of exons and introns is discussed in detail in Section 11.4 (see Figure 11.24). For the present purposes, we will simply use these gene parts as landmarks of evolution. Examination of genes encoding certain globin-like polypeptides, such as the plant protein leghemoglobin and the muscle protein myoglobin, reveals the presence of four exons and three introns. This is proposed to represent the ancestral

Figure 10.23 A pathway for the evolution of globin genes. Exons are shown in green, introns in brown. The evolutionary steps depicted in the diagram are discussed in the text. The arrangement of α- and β-globin genes on human chromosomes 16 and 11 (shown in blue without their introns in step 7) is the product of several hundred million years of evolution. As discussed in Chapter 2, hemoglobin molecules consist of two pairs of polypeptide chains: one pair is always a member of the α-globin subfamily, and the other pair is always a member of the β-globin subfamily. Specific combinations of α- and β-globins are found at different stages of development. The α- and β-globin chains that are observed in embryonic, fetal, and adult hemoglobins are indicated.

[Diagram labels, in order of evolutionary time: ancestral globin gene; exon fusion (step 1); duplication (step 3); divergence by mutation (step 4); separation of genes (step 5); duplications and divergence (step 6); modern globin genes (step 7). β-Globin gene family (chromosome 11): ε (embryo), Gγ and Aγ (fetus), ψβ1 (pseudogene), δ and β (adult). α-Globin gene family (chromosome 16): ζ (embryo), ψζ and ψα1 (pseudogenes), α2 and α1 (fetus and adult).]

form of the globin gene. It is thought that the modern globin polypeptide arose from the ancestral form as the result of the fusion of two of the globin exons (step 1, Figure 10.23) some 800 million years ago. A number of primitive fish are known that have only one globin gene (step 2), suggesting that these fish diverged from other vertebrates prior to the first duplication of the globin gene (step 3). Following this duplication approximately 500 million years ago, the two copies diverged by mutation (step 4) to form two distinct globin types, an α type and a β type, located on a single chromosome. This is the present arrangement in the amphibian Xenopus and in zebrafish. In subsequent steps, the α and β forms are thought to have become separated from one another by a process of rearrangement that moved them to separate chromosomes (step 5). Each gene then underwent subsequent duplications and divergence (step 6), generating the arrangement of globin genes that exists today in humans (step 7). The evolution of vertebrate globin genes illustrates how gene duplication typically leads to the generation of a family of genes whose individual members have specialized functions (in this case, embryonic, fetal, and adult forms) when compared to the single founding gene. When the DNA sequences of globin gene clusters were analyzed, researchers found “genes” whose sequences are homologous to those of functional globin genes, but which have accumulated severe mutations that render them nonfunctional. Genes of this type, which are evolutionary relics, are known as pseudogenes. Examples of pseudogenes are found in both the human α- and β-globin gene clusters of Figure 10.23. The human genome contains an estimated 19,000 pseudogenes. Although pseudogenes do not encode functional proteins, they can be transcribed into RNAs, which may have regulatory functions.
Another point that is evident from examination of the two globin clusters of human chromosomes is how much of the DNA consists of noncoding sequences, either as introns within genes or as spacers between genes. In fact, the globin regions contain a much higher fraction of coding sequences than most other regions of the genome.

“Jumping Genes” and the Dynamic Nature of the Genome

If one looks at repeated sequences that have arisen during the normal course of evolution, one finds that repeats are sometimes present in tandem arrays, sometimes present on two or a few chromosomes (as in the case of the globin genes of Figure 10.23), and sometimes dispersed throughout the genome. If we assume that all members of a family of repeated sequences arose from a single copy, then how can individual members become dispersed among different chromosomes? The first person to suggest that genetic elements were capable of moving around the genome was Barbara McClintock, a geneticist working with maize (corn) at the Cold Spring Harbor Laboratory in New York. Genetic traits in maize are often revealed as changes in the patterns and markings in leaf and kernel coloration (Figure 10.24). In the late 1940s, McClintock found that certain mutations were very unstable, appearing and disappearing from one generation to the next or even during the lifetime of an individual plant. After several years of careful study, she concluded that certain genetic elements were moving from one place in a chromosome to an entirely different site. She called this genetic rearrangement transposition, and the mobile genetic elements transposable elements. Meanwhile, molecular biologists working with bacteria were finding no evidence of “jumping genes.” In their studies, genes appeared as stable elements situated in a linear array on the chromosome that remained constant from one individual to another and from one generation to the next. McClintock’s findings were largely ignored. Then, in the late 1960s, several laboratories discovered that certain DNA sequences in bacteria moved on rare occasion from one place in the genome to another. These bacterial transposable elements were called transposons. Most transposons encode a protein, a transposase, that single-handedly catalyzes the excision of a transposon from a donor DNA site and its subsequent insertion at a target DNA site. This “cut-and-paste” mechanism is mediated by two separate transposase subunits that bind to inverted, repeated sequences at the two ends of the transposon (Figure 10.25, step 1). The two subunits then come together to form an active dimer (step 2) that catalyzes a series of reactions leading to the excision of the transposon (step 3). The transposase–transposon complex then binds to a target DNA (step 4) where the transposase catalyzes the reactions required to integrate the transposon into its new residence (step 5). Integration of the element typically creates a small duplication in the target DNA that flanks the transposed element at the site of insertion. Target site duplications serve as “footprints” to identify sites in the genome that are occupied by transposable elements.

As originally demonstrated by McClintock, eukaryotic genomes contain large numbers of transposable elements. In fact, at least 45 percent of the DNA in a human cell nucleus has been derived from transposable elements! The vast majority (>99 percent) of transposable elements are incapable of moving from one place to another; they have either been crippled by mutation or their movement is suppressed by the cell. (Suppression occurs by means of small cellular RNAs [Section 11.5] and DNA methylation [Section 12.4].) However, when transposable elements do change position, they insert widely throughout the target DNA. In fact, many transposable elements can insert themselves within the center of a protein-coding gene. Several examples of such an occurrence have been documented in humans, including a number of cases of hemophilia caused by a mobile genetic element that had “jumped” into the middle of one of the key blood clotting genes. It is estimated that approximately 1 out of 500 disease-causing mutations in humans is the result of the insertion of a transposable element. In addition, the reactivation of transposable elements may contribute to the development of certain cancers.

Figure 10.24 Visible manifestations of transposition in maize. Kernels of corn are typically uniform in color. The spots on this kernel result from a mutation in a gene that codes for an enzyme involved in pigment production. Mutations of this type can be very unstable, arising or disappearing during the period in which a single kernel develops. These unstable mutations appear and disappear as the result of the movement of transposable elements into and out of these genes during the period of development. (COURTESY OF VENKATESAN SUNDARESAN, COLD SPRING HARBOR LABORATORY.)

Figure 10.25 Transposition of a bacterial transposon by a “cut-and-paste” mechanism. As discussed in the text, the two ends of this bacterial Tn5 transposon are flanked by repeated sequences (orange segments). The two ends are brought together by the dimerization of a pair of subunits of the transposase (orange spheres). Both strands of the double helix are cleaved at each end, which excises the transposon as part of a complex with the transposase. The transposon–transposase complex is “captured” by a target DNA, and the transposon is inserted in such a way as to produce a small duplication that flanks the transposed element. (Note: Not all DNA transposons move by this mechanism.) [Diagram labels: donor DNA, transposon DNA, target DNA; stages: transposase binding, cleavage, target capture, integration.] (FROM D. R. DAVIES ET AL., SCIENCE 289:77, 2000; COPYRIGHT 2000, REPRINTED WITH PERMISSION FROM AAAS.)
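The cut-and-paste pathway of Figure 10.25 (excision from the donor, capture of a target, and integration with a small flanking duplication) can be sketched with strings in Python. This is a toy model under stated assumptions: the function name, the bracketed `[Tn]` placeholder for the transposon, the insertion coordinate, and the duplication length `tsd_len` are all hypothetical, and the donor ends are simply rejoined without modeling repair.

```python
def cut_and_paste(donor, transposon, target, site, tsd_len):
    """Move `transposon` from `donor` into `target` at index `site`.

    The `tsd_len` bases of target starting at `site` end up on both sides
    of the inserted element, imitating a target site duplication.
    """
    start = donor.find(transposon)
    if start == -1:
        raise ValueError("transposon not present in donor")
    if not 0 <= site <= len(target) - tsd_len:
        raise ValueError("insertion site outside target")

    # Excision: remove the element and rejoin the donor ends.
    donor_after = donor[:start] + donor[start + len(transposon):]

    # Integration: the short target segment is present once before
    # insertion and flanks the element on both sides afterward.
    tsd = target[site:site + tsd_len]
    target_after = (target[:site] + tsd + transposon + tsd
                    + target[site + tsd_len:])
    return donor_after, target_after


donor_after, target_after = cut_and_paste(
    "AAAA[Tn]CCCC", "[Tn]", "GGGTACGGG", site=3, tsd_len=3)
print(donor_after)   # AAAACCCC
print(target_after)  # GGGTAC[Tn]TACGGG
```

In the output, the repeated `TAC` on either side of `[Tn]` plays the role of the "footprint" that marks a site occupied by a transposable element.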


Figure 10.26 illustrates two major types of eukaryotic transposable elements, DNA transposons and retrotransposons, and their differing mechanisms of transposition. As described above for prokaryotes, most eukaryotic DNA transposons are excised from the DNA at the donor site and inserted at a distant target site (Figure 10.26a). This “cut-and-paste” mechanism is utilized, for example, by members of the mariner family of transposons, which are found throughout the plant and animal kingdoms. Retrotransposons, in contrast, operate by means of a “copy-and-paste” mechanism that involves an RNA intermediate (Figure 10.26b). The DNA of the transposable element is transcribed, producing an RNA, which is then “reverse transcribed” by an enzyme called reverse transcriptase, producing a complementary DNA. The DNA copy is made double-stranded and then integrated into a target DNA site. In most cases, the retrotransposon itself contains the sequence that codes for a reverse transcriptase. Retroviruses, such as the virus responsible for AIDS, use a very similar mechanism to replicate their RNA genome and integrate a DNA copy into a host chromosome.

The Role of Mobile Genetic Elements in Genome Evolution

It was noted on page 403 that moderately repeated DNA sequences constitute a significant portion of eukaryotic genomes. Unlike the highly repeated fraction of the genome (satellite, minisatellite, and microsatellite DNA), whose sequences reside in tandem and arise by DNA duplication, most of the moderately repeated sequences of the genome are interspersed and arise by transposition of mobile genetic elements. In fact, the two most common families of moderately repeated sequences in human DNA, the Alu and L1 families, are retrotransposons. Recall from page 403 that there are two classes of interspersed elements, SINEs and LINEs. Alu is an example of the former and L1 is an example of the latter. A full-length, transposable L1 sequence (at least 6000 base pairs in length) encodes a unique protein with two catalytic activities: a reverse transcriptase that makes a DNA copy of the RNA that encoded it and an endonuclease that nicks the target DNA prior to insertion. The human genome is estimated to contain about 500,000 copies of L1, but the vast majority of these are incomplete, immobile elements. Even so, L1 mobility continues to affect human evolution. In one study, for example, that compared the DNA sequences of 25 different people, any two individuals in the group differed in the presence or absence of an L1 element, on average, at 285 sites in their respective genomes. Even more abundant than L1 are the Alu sequences, which are interspersed at more than one million different sites

Figure 10.26 Schematic pathways in the movement of transposable elements. (a) DNA transposons move by a cut-and-paste pathway, whose mechanism is depicted in Figure 10.25. Approximately 3 percent of the human genome consists of DNA transposons, none of which are capable of transposition (i.e., all are relics left in the genome as a result of ancestral activity). (b) Retrotransposons move by a copy-and-paste pathway. The steps involved in retrotransposition take place both in the nucleus and cytoplasm and require numerous proteins, including those of the host. More than 40 percent of the human genome consists of retrotransposons, only a few of which (e.g., 40–100) are thought to be capable of transposition. More than one mechanism of retrotransposition is known. [Diagram labels: (a) excision of transposon; recipient DNA; recipient DNA with transposon; donor DNA after loss of transposon and rejoining of ends. (b) donor DNA with retrotransposon; transcription by RNA polymerase (RNA); reverse transcription to form single-stranded cDNA; conversion to double-stranded DNA; recipient DNA; recipient DNA with retrotransposon, with the donor DNA retaining its retrotransposon.]
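The copy-and-paste pathway of Figure 10.26b (transcription, reverse transcription, second-strand synthesis, integration) can be traced on short sequences in Python. The sketch below is illustrative only: the function names, the example sequences, and the insertion coordinate are invented; sequences are written 5′ to 3′; only the sense strand of each DNA is tracked; and the target site duplication and host factors are left out.

```python
# Base-pairing tables used for the two synthesis steps.
RNA_TO_CDNA = str.maketrans("AUGC", "TACG")  # RNA template -> complementary DNA
DNA_TO_DNA = str.maketrans("ATGC", "TACG")   # DNA template -> complementary DNA


def transcribe(dna_sense):
    """RNA polymerase: the transcript matches the sense strand, with U for T."""
    return dna_sense.replace("T", "U")


def reverse_transcribe(rna):
    """Reverse transcriptase: single-stranded cDNA complementary to the RNA."""
    return rna.translate(RNA_TO_CDNA)[::-1]


def second_strand(cdna):
    """Synthesis of the strand complementary to the cDNA (the sense strand)."""
    return cdna.translate(DNA_TO_DNA)[::-1]


def copy_and_paste(donor, element, target, site):
    """Insert a new copy of `element` into `target`; the donor is unchanged."""
    if element not in donor:
        raise ValueError("element not present in donor")
    if not 0 <= site <= len(target):
        raise ValueError("insertion site outside target")
    rna = transcribe(element)
    cdna = reverse_transcribe(rna)
    new_copy = second_strand(cdna)
    return donor, target[:site] + new_copy + target[site:]


donor_after, target_after = copy_and_paste("TTATGGCATT", "ATGGCA", "CCCCCC", 3)
print(donor_after)   # TTATGGCATT
print(target_after)  # CCCATGGCACCC
```

Unlike the cut-and-paste case, the donor sequence is returned intact, so each event adds a copy, which is how retrotransposons can come to occupy a large fraction of a genome.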


1. Transposable elements can, on occasion, carry adjacent parts of the host genome with them as they move from one site to another. Theoretically, two unlinked segments of the host genome could be brought together to form a new, composite segment. This may be a primary mechanism in the evolution of proteins that are composed of domains derived from different ancestral genes (as in Figure 2.36).
2. DNA sequences that were originally derived from transposable elements are found as parts of eukaryotic genes as well as parts of the DNA segments that regulate gene expression. For example, several transcription factors, which are the proteins that regulate gene expression, bind to sites in the DNA that arose originally from transposable elements. Even when there is no direct evidence of a function, many transposable elements are very similar in position and sequence to elements in the genomes of distant vertebrate relatives. This type of evolutionary conservation suggests that these sequences carry out some beneficial role in the lives of their hosts (page 413).
3. In some cases, transposable elements themselves appear to have given rise to genes. The enzyme telomerase, which plays a key role in replicating the DNA at the ends of chromosomes (see Figure 12.24c), may be derived from a reverse transcriptase encoded by an ancient retrotransposon. The enzymes involved in the rearrangement of antibody genes (see Figure 17.18) are thought to be derived from a transposase encoded by an ancient DNA transposon. If this is in fact the case, our ability to ward off infectious diseases is a direct consequence of transposition.
4. A number of recent studies have found evidence that mammalian brain cells have a greatly elevated level of L1 retrotransposition compared to that of other tissues. It is hypothesized that these mobile elements, which insert at different sites in the genomes of different nerve cells, contribute to functional differences in the activities of brain cells.

One point is clear: transposition has had a profound impact on the genetic composition of organisms. It is interesting to note that, only a few decades ago, molecular biologists considered the genome to be a stable repository of genetic information. Now it seems remarkable that organisms can maintain themselves from one day to the next in the face of this large-scale disruption by genetic rearrangement. In hindsight, it is not surprising that transposition was first discovered in plants, because transposable elements tend to be much more active in plants than they are in other eukaryotes. For her discovery of transposition, Barbara McClintock was the sole recipient of a Nobel Prize in 1983, at age 81, approximately 35 years after her initial report.

REVIEW 1. Describe the course of evolutionary events that are thought to give rise to multiple-gene families, such as those that encode the globins. How could these events give rise to pseudogenes? How could they give rise to proteins having entirely different functions? 2. Describe two mechanisms by which genetic elements are able to move from one site in the genome to another. 3. Describe the impact that transposable elements have had on the structure of the human genome over the past 50 million years.

10.6 | Sequencing Genomes: The Footprints of Biological Evolution

Determining the nucleotide sequence of all the DNA in a genome is a formidable task. During the 1980s and 1990s, the technology to accomplish this effort gradually improved, as researchers developed new vectors to clone large segments of DNA and increasingly automated procedures to determine the nucleotide sequences of these large fragments (discussed in


throughout the human genome. Alu is a family of short, related sequences about 300 base pairs in length. The Alu sequence closely resembles that of the small RNA present in the signal recognition particles found in conjunction with membrane-bound ribosomes (page 283). It is presumed that during the course of evolution, this cytoplasmic RNA was copied into a DNA sequence by reverse transcriptase and integrated into the genome. The tremendous amplification of the Alu sequence is thought to have occurred by retrotransposition using the reverse transcriptase and endonuclease encoded by L1 sequences. Given its prevalence in the human genome, one might expect the Alu sequence to be repeated in genomes throughout the rest of the animal kingdom, but this is not the case. Comparative genomic studies indicate that the Alu sequence first appeared as a transposable element in the genome of primates about 60 million years ago and has been increasing in copy number ever since. The rate of Alu transposition has slowed dramatically over the course of primate evolution to its current estimated rate in humans of approximately once in every 200 births. These transposition events generate differences in the locations of Alu sequences from one person to another and thus contribute to the genetic diversity in the human population (page 416). When something new like a transposable element is discovered in an organism, a biologist’s first question is usually: What is its function? Many researchers who study transposition believe that transposable elements are primarily “junk.” According to this view, a transposable element is a type of genetic parasite that can invade a host genome from the outside world, spread within that genome, and be transmitted to offspring, so long as it doesn’t have serious adverse effects on the ability of the host to survive and reproduce. Even if this is the case, it doesn’t mean that transposable elements cannot make positive contributions to eukaryotic genomes.
Keep in mind that evolution is an opportunistic process—there is no preset path to be followed. Regardless of its origin, once a DNA sequence is present in a genome, it has the potential to be “put to use” in some beneficial manner during the course of evolution. For this reason, the portion of the genome formed by transposable elements has been referred to as a “genetic scrap yard.” There are several ways that transposable elements appear to have been involved in adaptive evolution:


Figure 10.27 Genome comparisons. Among eukaryotes whose genomes have been sequenced, the number of protein-coding genes (blue bars) varies from about 6,200 in yeast to 57,000 in apples, with vertebrates thought to possess about 20,000. (The high gene number in an apple reflects a relatively recent whole-genome duplication.) It is particularly interesting to note that apparent increases in the complexity of organisms are not reflected in dramatic increases in gene numbers. For example, neither (1) the transition from single-celled eukaryotes, such as Chlamydomonas, to the simplest multicellular animals, sponges, nor (2) the transition from invertebrates to vertebrates, is accompanied by major changes in the number of protein-coding genes. Whereas the number of estimated protein-coding genes varies over a modest range among eukaryotes, the amount of DNA in a genome (red bars) varies widely, reaching values of 90 billion base pairs in some salamanders (the actual gene number for these amphibians is unknown). [Chart labels: species shown are baker's yeast, Chlamydomonas, sponge, sea anemone, fruit fly, nematode, puffer fish, salamander (gene number marked “??”), chicken, mouse, human, mustard, maize, and apple; the axes are estimated number of genes encoding proteins and genome size in millions of base pairs (Mbp).]

Section 18.14). The first complete sequence of a prokaryotic organism was reported in 1995, and the first complete sequence of a eukaryote, the budding yeast S. cerevisiae, was published the following year. Over the next few years, as the scientific community awaited the results of work on the human genome, the genomic sequences of numerous prokaryotic and eukaryotic organisms (including a fruit fly, nematode, and flowering plant) were reported. Researchers were able to sequence these genomes with relative speed because they are considerably smaller than the human genome, which contains approximately 3.2 billion base pairs. To put this number into perspective, if each base pair in the DNA were equivalent to a single letter on this page, the information contained in the human genome would produce a book approximately 1 million pages long. By 2001, the rough draft of the nucleotide sequence of the human genome had been published. The sequence was described as “rough” because each segment was sequenced an average of about four times, which is not enough to ensure complete accuracy, and many regions that proved difficult to sequence were excluded. The first attempts to annotate the human genomic sequence, that is, to interpret the sequence in terms of the numbers and types of genes it encoded, led to a striking observation concerning gene number. Researchers concluded that the human genome probably contained in the neighborhood of 30,000 protein-coding genes. Up until it had been sequenced, it had been widely assumed that the human genome contained at least 50,000 and maybe as many as 150,000 different genes. The “finished” version of the human genome sequence was reported in 2004, which meant that (1) each site had been sequenced 7–10 times to ensure a very high degree of accuracy (at least 99.99 percent), and (2) the sequence contained a minimal number of gaps. The gaps that persisted contain regions of the chromosomes (often referred to as “dark matter”) that consist largely of long stretches of highly repeated DNA, especially those in and around the centromeres of each chromosome. Despite exhaustive efforts, these regions have proven impossible to clone, or their sequences have proven impossible to order correctly using current technology. Despite this remarkable achievement in nucleotide sequencing, the actual number of protein-coding genes in the human genome remains uncertain. Identification of genes using various computer software programs (algorithms) has been plagued with difficulties, and, in fact, the earlier estimate of 30,000 protein-coding human genes has been revised steadily downward. Although it has come as a shock to most biologists, current estimates place the number in the neighborhood of 21,000!
This means that humans have roughly the same number of protein-coding genes as a microscopic worm, whose entire body (nervous system and all) consists of approximately 1000 cells (Figure 10.27).⁴ It is evident from these data that it is not possible, as was once thought, to understand the nature of an organism by simply knowing a list of the genes that make up that organism’s genome. If the differences in complexity between organisms cannot be explained by the number of protein-coding genes in their genomes, how can they be explained? We really don’t have a very good answer to that question, but we can list a few possibilities to consider.

1. As will be described in Chapter 12, a single gene can encode a number of related proteins as a result of a process termed alternative splicing (see Section 12.5). Several recent studies suggest that over 90 percent of human genes might engage in alternative splicing, so that the actual number of proteins encoded by the human genome is at least several times greater than the number of genes it contains. It is likely that greater differences between organisms will emerge when this and other “gene-enhancing” mechanisms are explored further. This expectation is consistent with the observations that vertebrate genes tend to be more complex (i.e., contain more exons) than those of flies and worms and exhibit a higher incidence of alternative splicing.

2. Molecular biologists have devoted an enormous effort to studying the mechanisms that regulate gene expression. Despite this effort, our understanding of these mechanisms is very limited. We have learned, for example, that a large portion of the genome is transcribed into a bewildering array of RNAs, but we have very little information as to what most of these RNAs are doing within the cell. There is a growing belief that many of these RNAs have a gene regulatory function. There is also growing evidence that the number and complexity of these noncoding RNAs can be correlated with the levels of complexity of diverse organisms. In one study, for example, researchers sought to identify the number of microRNAs produced by various organisms. As discussed in the next chapter, microRNAs are among the best-studied regulatory RNAs. Researchers found that sponges expressed about 10 different microRNAs, and sea anemones approximately 40. This compares to approximately 150 identified in worms and fruit flies and at least one thousand in humans. This is not to imply that the number of microRNAs is the primary determinant of morphological complexity. Rather, it suggests that we will have to learn a great deal more about gene regulation if we are going to understand the basis of biological diversity.

3. Over the past decade or so, a new area of biological study (called systems biology) has emerged that focuses on the ways that proteins work together as complex networks rather than as individual actors. A very simple example of a protein network was presented in Figure 2.41. Given that cells produce thousands of different proteins, with varying degrees of interaction, these networks can become very complex and dynamic. A relatively small increase in the number of elements that make up a network, or an increase in the size and complexity of those elements, could dramatically increase the complexity of the network and thus the complexity of the entire organism.

Numerous other factors could be added to this discussion, but the general point is clear: The apparent difference in complexity between different groups of multicellular organisms is less a matter of the amount of genetic information in an organism’s genome than that of the manner in which that information is put to use.

⁴ It can also be seen from Figure 10.27 that there is very little correlation between the number of protein-coding genes and the total amount of DNA in the genome. The puffer fish, for example, which has about the same number of genes as other vertebrates, has a genome that is about one-eighth the size of its human counterpart. The ancestors of puffer fish, like most bony fish, are thought to have had typical vertebrate-sized genomes, which demonstrates that “excess” DNA can be lost in a lineage over evolutionary time. At the other end of the spectrum from the puffer fish, the genome of certain salamanders is roughly 30 times the size of the human genome. The contrasts in genome size among vertebrates reflect a striking difference in the content of noncoding, largely repetitive DNA. The evolutionary significance of these differences is unclear.

10.6 Sequencing Genomes: The Footprints of Biological Evolution

Comparative Genomics: “If It’s Conserved, It Must Be Important”

Consider the following facts: (1) The majority of the genome consists of DNA that resides between the genes, and thus represents intergenic DNA, and (2) each of the roughly 21,000 or so protein-coding human genes consists largely of noncoding portions (intronic DNA). Taken together, these facts indicate that the protein-coding portion of the genome represents a remarkably small percentage of the total DNA (estimated at about 1.5 percent). Most of the intergenic and intronic DNA of the genome does not contribute to the survival and reproductive capability of an individual, so there is no selective pressure to maintain its sequence unchanged. As a result, most intergenic and intronic sequences tend to change rapidly as organisms evolve. In other words, these sequences tend not to be conserved. In contrast, those segments of the genome that encode protein sequences or contain regulatory sequences that control gene expression (see Figure 12.45) are subject to natural selection. Natural selection tends to eliminate individuals whose genome contains mutations in these functional elements.⁵ As a result, these sequences tend to be conserved.

It follows from these comments that the best way to identify functional sequences is to compare the genomes of different types of organisms. Despite the fact that humans and mice have not shared a common ancestor for about 75 million years, the two species share similar genes, which tend to be clustered in a remarkably similar pattern. For example, the number and order of human globin genes depicted in Figure 10.23 is basically similar to that found on a comparable segment of DNA in the mouse genome. As a result, it becomes possible to align corresponding regions of the mouse and human genomes. Human chromosome number 12, for example, is composed of a series of segments, in which each segment corresponds roughly to a block of DNA from a mouse chromosome. The number of the mouse chromosome containing each block is indicated.

[Drawing: human chromosome 12 divided into blocks, each labeled with the number of the corresponding mouse chromosome (labels recovered from the original: 10, 15, 6).]

⁵ One can recognize two opposing sides of natural selection. Negative or purifying selection acts to maintain (conserve) sequences with important functions as they are because changes reduce the likelihood that the individual will survive and reproduce. These regions of the genome evolve more slowly than nonfunctional sequences, whose change has no effect on the fitness of the individual and which are not subject to natural selection (such sequences are said to evolve neutrally). In contrast, positive or Darwinian selection promotes speciation and evolution by selecting for changes in sequence that cause individuals to be better adapted to their environment and thus more likely to survive and reproduce. These regions evolve more rapidly than nonfunctional segments.

Even a quick examination of this drawing of a human chromosome reveals the dramatic changes in chromosome structure that occur as evolution generates new species over time. Blocks of genes that are present on the same chromosome in one species can become separated into different chromosomes in a subsequent species. Over time, chromosome numbers can increase or decrease as chromosomes split apart or fuse together. One estimate suggests that approximately 180 breakage and fusion events have taken place in the mouse and human lineages since the time these two present-day mammals last shared a common ancestor. Taken by themselves, changes in the positions of genes have very little effect on the phenotypes of the organisms, but they provide a clear visual footprint of the evolutionary process. When homologous segments of the mouse and human genome are aligned on the basis of their nucleotide sequences, it is found that approximately 5 percent of DNA sequences are highly conserved between the two species. This is a considerably higher percentage than would have been expected by combining
the known protein-coding regions and gene regulatory regions (together accounting for roughly 2 percent of the genome). If we adhere to one of the foremost principles of molecular evolution: “if it’s conserved, it must be important,” then these studies are telling us that parts of the genome that were presumed to consist of “useless” sequences actually have important, unidentified functions. Some of these regions undoubtedly encode RNAs that have various regulatory functions (discussed in Section 11.5). Others likely have “chromosomal functions” rather than “genetic functions.” For example, these conserved sequences could be important for chromosome pairing prior to cell division. Whatever the function, these genomic elements are often located at great distances from the closest gene and include some of the most conserved sequences ever discovered, exhibiting virtual identity between human, rat, and mouse genomes. By comparing stretches of the genome between two distantly related species, such as the human and mouse, one can identify those regions that have been conserved for tens of millions of years. However, this approach won’t identify functional sequences in the human genome that have a more recent evolutionary origin. This could include, for example, genes that are present in humans and absent from mice, or regulatory regions that have changed their sequence over the course of primate evolution to allow them to bind new regulatory proteins. A concerted effort (called the ENCODE Project) is underway to identify all of the functional elements present in the human genome. Unfortunately, we lack the knowledge necessary to recognize many of these functional elements, which obviously confounds the task. One point has become clear from recent studies: a significant proportion of functional DNA sequences are constantly evolving and thus are not highly conserved. 
In other words, if we restrict our search to conserved sequences, we run the risk of missing many of the most important functional elements in the genome.
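
The idea of locating conserved regions in aligned mouse and human sequences can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length (with "-" marking gaps) and scores fixed, non-overlapping windows by the fraction of identical bases. The function names, the window size, the 0.9 threshold, and the toy sequences are all hypothetical choices made for illustration; this is not the method used in the studies described above, and real comparative-genomics pipelines are considerably more elaborate.

```python
def window_identity(seq_a, seq_b, window=10):
    """Return (start, fraction_identical) for each non-overlapping window
    of two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # A position counts as a match only if both bases agree and are not gaps.
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        results.append((start, matches / window))
    return results


def conserved_windows(seq_a, seq_b, window=10, threshold=0.9):
    """Return start positions of windows whose identity meets the threshold."""
    return [start
            for start, frac in window_identity(seq_a, seq_b, window)
            if frac >= threshold]


# Toy, made-up aligned sequences (NOT real human or mouse DNA).
human = "ATGGCCAAGT" + "TTACGGATCA" + "GGCTAGCTAA"
mouse = "ATGGCCAAGT" + "CAGTTCGAAC" + "GGCTAGCTAT"

print(window_identity(human, mouse))
print(conserved_windows(human, mouse))
```

In this toy case the first and third windows would be flagged as conserved and the middle window would not. As the text cautions, a filter of this kind can only find sequences that have stayed similar; functional elements that are evolving rapidly would be missed.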


The Genetic Basis of “Being Human”

By focusing on conserved sequences, we tend to learn about characteristics that we share with other organisms. If we want to better understand our own unique biological evolution, we have to look more closely at those parts of the genome that distinguish us from other organisms. Chimpanzees are our closest living relatives, having shared a common ancestor as recently as 5–7 million years ago. It was thought that a detailed analysis of the differences in DNA sequence and gene organization between chimpanzees and humans might tell us a great deal about the genetic basis for recently evolved features that make us uniquely human, such as our upright walk and advanced use of tools and language. These latter characteristics can be traced to our brain, which has a volume of about 1300 cm³, nearly four times that of a chimpanzee. The rough draft of the chimpanzee genome was reported in 2005. Overall, the chimpanzee and human genomes differ by roughly 4 percent, which amounts to tens of millions of differences, a level of divergence that is considerably greater than expected from preliminary studies. While some of this divergence is due to single nucleotide changes between the two genomes, most is attributed to larger differences, such as deletions and segmental duplications (see footnote, page 407).

Researchers have been able to identify hundreds of genes in the human lineage that are evolving more quickly than the background (or neutral) rate, presumably in response to natural selection. However, it remains unclear which, if any, of these genes contribute to “making us human.” Some of the fastest evolving genes encode proteins involved in regulation of gene expression (i.e., transcription factors). These are precisely the types of genes that would be expected to generate major phenotypic differences because they can affect the expression of large numbers of other genes. In fact, differences between chimp and human transcription factors are presumably responsible for many of the differences in expression of brain proteins seen in Figure 2.48. This can be illustrated by closer examination of a brain-specific transcription factor called FOXP2. A comparison of the FOXP2 protein between humans and chimps shows two amino acid differences that have appeared in the human lineage since the time of separation from our last common ancestor. To assess the effects of these amino acid substitutions on FOXP2 function, human neuronal cells that lacked their own FOXP2 gene were engineered to express either the chimp or human version of the gene. These two cell populations were then studied in culture to assess the effects of the alternate versions of the transcription factor. It was found that more than 100 target genes were significantly upregulated or downregulated in cells expressing the human FOXP2 gene when compared to cells expressing the chimp version of the transcription factor. What makes the FOXP2 gene so interesting is that persons with mutations in the gene suffer from a severe speech and language disorder. Among other deficits, persons with this disorder are unable to perform the fine muscular movements of the lips and tongue that are required to engage in vocal communication. 
Calculations suggest that the changes in this “speech gene” that distinguish it from the chimp version became “fixed” in the human genome within the past 120,000 to 200,000 years, a time when modern humans are thought to have emerged. (An alteration in DNA sequence is said to be fixed if it is now present in virtually all members of the species.) These findings suggest that changes in the FOXP2 gene may have played an important role in the evolution and development of human speech. Many other genomic differences between chimps and humans have also been described, including alterations in certain RNA-encoding genes that appear to affect brain development. One of the genes in this latter category, called HAR1, stretches for a mere 118 base pairs but contains 18 base substitutions that distinguish it from the corresponding region in the chimp genome. This same region is highly conserved among other vertebrates, so it is only during the evolution of humans that it has experienced such pronounced effects of natural selection. (HAR stands for Human Accelerated Region). Although the function of HAR1 is unknown, the fact that it is expressed primarily in the developing fetal brain makes it a gene of interest to researchers trying to unravel the genomic basis of human brain expansion. Another gene of interest encodes the starch-digesting enzyme amylase that is present in saliva. Chimps have a relatively low-starch diet and have one copy of the AMY1 gene in their genome (Figure 10.28a). During the evolution of humans, there has been an apparent selection for an increased number of copies of the AMY1 gene, which has resulted in an


increased concentration of the amylase enzyme in human saliva. One might have predicted that an increased amylase level would have been accomplished during evolution by an increase in the level of expression of the single existing AMY1 gene in the genome, but in this particular case it has resulted from gene duplication. As seen in Figure 10.28b and discussed in the next section, the number of copies of the AMY1 gene among humans is variable. In fact, the number of copies of this gene tends to be higher in human populations that ingest greater quantities of starch in their diet.

Figure 10.28 Duplication of the amylase gene during human evolution. In the results depicted here, a pair of homologous chromosomes from a chimp (a) or a human (b) have been hybridized to red and green colored probes that bind to different portions of the AMY1 gene. This gene encodes the starch-digesting enzyme salivary amylase. The number of copies of the AMY1 gene on each chromosome is revealed by the number of times the fluorescent probes are repeated. The chimp has one copy of the AMY1 gene on each chromosome (i.e., one copy per genome), whereas humans have a number of copies. The number of copies of the gene varies within the human genome as illustrated here by the fact that one of the two homologous chromosomes from this individual has 4 copies of the gene and the other homologous chromosome has 10 copies. This is an example of a copy number variation (page 417). Moreover, the number of copies of the AMY1 gene in the genomes of a given human population tends to correlate with the amount of starch in the diet of that population. This correlation strongly suggests that the copy number of the AMY1 gene has been influenced by natural selection. (FROM GEORGE H. PERRY, ET AL., COURTESY OF NATHANIEL J. DOMINY, NATURE GEN. 39:1257, 2007. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

One of the most interesting questions in the area of human evolution concerns possible relationships between modern humans (i.e., Homo sapiens) and other “human” species (i.e., archaic members of the Homo genus that are now extinct). Humans that would be indistinguishable from our present species, that is, anatomically modern humans, are thought to have first evolved in Africa approximately 200,000 years ago. But Homo sapiens is not the only member of the Homo genus to have inhabited Earth in the past 200,000 years. In fact, another member of the genus, the Neanderthals (Homo neanderthalensis), was a resident of Europe as recently as 35,000 years ago, thousands of years after the arrival of modern humans on the continent. The two species are thought to have shared some of the same regions in the Middle East at an even earlier time period. Given that both of our species inhabited similar areas and were very similar anatomically, paleontologists have wondered whether (1) the two species had interbred or (2) modern humans had simply replaced the Neanderthals without interbreeding. An answer to that question requires detailed information about the differences in DNA sequences between modern humans and Neanderthals. Beginning in the late 1990s, researchers have developed increasingly sophisticated techniques to isolate and sequence snippets of mitochondrial DNA (mtDNA) from fossil remains of Neanderthals. Mitochondrial DNA is easier to analyze from fossils than is nuclear DNA because each cell contains many copies of the mitochondrial genome and it is much smaller in size. Results of these studies suggested that the Neanderthal mtDNA sequence was sufficiently different from that of modern human mtDNA to conclude that Neanderthals went extinct without contributing genetic material to the modern human mitochondrial genome. In other words, the two species had not interbred and, moreover, had not shared a common ancestor for at least 300,000 years. By 2010, Svante Pääbo and colleagues at the Max Planck Institute in Germany had assembled a relatively complete sequence of the Neanderthal nuclear genome and concluded that modern human and Neanderthal genomes are 99.84 percent identical.
Neanderthals, for example, had the same FOXP2 allele (discussed above) as do modern humans, which hinted that they too had verbal language skills. Moreover, a detailed comparison of the nuclear genomes from a number of different Neanderthal fossils and those of several present-day humans suggests that approximately 1 to 4 percent of the DNA of modern Europeans and Asians is derived from a Neanderthal source. Many of the genes in the Neanderthal-derived DNA contribute to the immune system’s recognition of pathogens. Acquisition of these genes may have served to protect the newly arrived humans from diseases to which the Neanderthals had previously been exposed. In contrast, the genomes of Africans show no evidence of a contribution by Neanderthals. According to this data, modern humans and Neanderthals interbred at some point in time and space after modern humans had left Africa and before they had spread into Europe and Asia. Other reports suggest that some groups of modern humans have ancient ancestors that may have interbred with other archaic human species besides Neanderthals. These findings have set the stage for some very interesting speculation concerning the human phylogenetic tree. According to some researchers who envision a number of human species sharing the planet over the past 100,000 years or so, our present population can be described as “the last humans standing.” We have focused in this section on genomic differences between humans and other primates because this is the subject of


the present chapter. It is probably evident from the limited nature of this discussion that researchers have not yet made much progress in identifying the genetic basis of “being human.” Many researchers believe that too much attention in this field has been focused on changes in protein-coding sequences. Instead, they would argue that changes in the regulation of gene expression have played a primary role in human evolution, but it is difficult to ascertain which of these changes have real importance.

Genetic Variation Within the Human Species Population

No one else in the world looks exactly like you because no one else, other than an identical twin, has the same DNA sequence throughout their genome. The human genome that was sequenced by the original Human Genome Project was derived largely from a single human male. Since the completion of that sequence, a great deal of attention has been focused on how DNA sequence varies within the human population. Genetic polymorphisms are sites in the genome that vary among different individuals. The term usually refers to a genetic variant that occurs in at least 1 percent of a species population. The concept of genetic polymorphisms began with the discovery in 1900 by an Austrian physician, Karl Landsteiner, that people could have at least three alternate types of blood, A, B, or O. As discussed on page 130, the ABO blood group is a result of members of the population having different alleles of a gene encoding a sugar-transferring enzyme. Now that we are in possession of our own genomic sequence, we are in a position to search for new types of genetic variation that would never have been revealed in the “pregenomic era.”

DNA Sequence Variation

The most common type of genetic variability in humans occurs at sites in the genome where single nucleotide differences are found among different members of the population. When present in at least 1 percent of the population, these sites are called single nucleotide polymorphisms (SNPs). The vast majority of SNPs (pronounced “snips”) occur as two alternate alleles, such as A or G. Most SNPs are thought to have arisen by mutation only once during the course of human evolution and are shared by individuals through common descent. Commonly occurring SNPs have been identified by comparing DNA sequences from the genomes of hundreds of different individuals from diverse ethnic populations. On average, two randomly selected human genomes have about 3 million single nucleotide differences between them, or one every thousand base pairs. Although this might appear to be a large number, it means that humans are, on average, 99.9 percent similar to one another with respect to nucleotide sequence, which is probably a much greater similarity than exists within most mammalian species. This sequence similarity reflects the fact that we are apparently a young species, in evolutionary terms, and our prehistoric population size was relatively small. Of the 15 million or so SNPs in the genome, it is estimated that approximately 20,000 are found within protein-coding sequences and lead to amino acid replacements within the encoded protein. This corresponds to about one amino acid substitution per gene. It is this body of genetic variation that is thought to be largely responsible for the phenotypic variability observed among humans. The impact of human genome sequence variation on the practice of medicine is discussed in the accompanying Human Perspective.

Structural Variation

As illustrated in Figure 10.29a, segments of the genome can change as the result of duplications, deletions, insertions, inversions (when a piece of DNA exists in a reversed orientation), and other events. These types of changes typically involve segments of DNA ranging from hundreds to millions of base pairs in length. Because of their relatively large size, these types of polymorphisms are called structural variants, and we are just beginning to grasp the extent and importance of their presence. It has been known since the early days of microscopic chromosome analysis that not all healthy people have identical chromosome structures. Figure 10.29b shows an example of a common chromosome inversion whose existence has been known for over 30 years. Recent studies have revealed that intermediate-sized structural variants (those that are too small to be seen under the microscope and too large to be readily detected by conventional sequence analysis) are much more common than previously thought. According to one estimate, a typical human genome carries approximately 1000 structural variants, ranging in length from about 500 bases to 1.3 million bases (1.3 Mb). Although the estimates vary considerably, it is evident that the fraction of the genome (i.e., total number of base pairs) affected by structural variation is greater than that affected by SNPs. As a result, structural variation has a significant effect on phenotypic variation among humans.

[Figure 10.29a: schematic rows of lettered genes. Panel labels: Normal gene order; Gene duplication (leads to copy number polymorphism); Deletion; Inversion; Insertion.]

Figure 10.29 Structural variants. (a) Schematic representation of the major types of genomic polymorphisms that involve a significant segment of a chromosome (e.g., thousands of base pairs). Most of these polymorphisms are too small to be detected by microscopic examination of chromosomes but are detected when the number and/or chromosomal location of individual genes are determined. (b) On the left is a normal human chromosome #9, and on the right is a chromosome #9 containing a large inversion that includes the chromosome’s centromere (arrow). This inversion, which is clearly visible microscopically, is present in 1–3 percent of the population. (B: FROM CHARLES LEE, NATURE GEN. 37:661, 2005. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

Copy Number Variation

As discussed on page 402, the technique of DNA fingerprinting depends on differences in the lengths of minisatellite sequences, which in turn depend on the number of copies of the sequence that are present at particular sites in the chromosomes. This is an example of a copy number variation (or CNV). Recent findings have revealed that much larger-sized CNVs (>1 kb) are also common in the human population, affecting approximately 10–15 percent of the genome, including large numbers of protein-coding genes. These large-sized CNVs are a type of structural variation that results from a duplication or deletion of a DNA segment. Because of such CNVs, many of us carry extra copies of one or more genes that encode important physiological proteins. Extra copies of genes are generally associated with overproduction of a protein, which can disrupt the delicate biochemical balance that exists within a cell. It is found, for example, that a significant number of persons who develop early-onset Alzheimer’s disease possess extra copies of the APP gene, which encodes the protein thought to be responsible for the disease (page 68). Although the subject has been controversial, there is a growing consensus among researchers that individuals with autism or schizophrenia have a greater-than-normal number of rare CNVs in their genomes. Some of the genes affected by these CNVs are involved in synapse formation and other neurological processes. In some cases, extra copies of a gene can be beneficial, as illustrated by the repeated duplication of the amylase gene (AMY1), which has been found in certain human populations that have a high starch content in their diet (Figure 10.28b). Figure 10.28b also provides an excellent example of a CNV in that it shows the number of AMY1 genes on both copies of one person’s homologous chromosomes. It can be seen that one chromosome contains only 4 copies of the AMY1 gene, whereas the homologous chromosome contains 10 copies of the gene.

THE HUMAN PERSPECTIVE

Over the past several decades, more than a thousand genes have been identified as the cause of rare, heritable diseases. In the vast majority of cases, the gene responsible for the disease being investigated was identified through traditional genetic linkage studies. Such studies begin by locating a number of families whose members exhibit a disproportionately high frequency of a particular disease. The initial challenge is to determine which altered region of the genome is shared by all affected members of the family. Once a region linked to the disease gene is identified, the DNA of that segment can be isolated and the mutant gene pinpointed. This type of genetic approach is well suited for the identification of genes that have very high penetrance, that is, genes whose mutant form virtually guarantees that the individual will be afflicted with the disorder. The mutated gene responsible for Huntington’s disease (page 404), for example, exhibits very high penetrance. Furthermore, no other gene defect in the genome causes this disease. Genes that exhibit very high penetrance are said to follow Mendelian inheritance, and more than 3000 such genes have been identified.

Most of the common diseases that afflict our species, such as cancer, heart failure, Alzheimer’s disease, diabetes, asthma, depression, and various age-related diseases, have a genetic component, which is to say that they show at least some tendency to run in families. But unlike inherited disorders such as Huntington’s disease, there is no single gene that is clearly linked with the condition. Instead, numerous genes are likely to affect the disease risk, with each of them making only a small contribution to the overall likelihood of developing the disorder. In addition, the development of the disorder is often influenced by nongenetic (i.e., environmental) factors. The likelihood of developing diabetes, for example, is greatly increased in individuals who are seriously overweight, and the likelihood of developing lung cancer is greatly increased by smoking. One of the goals of the medical research community is to identify genes that contribute to the development of these common, but genetically complex, diseases.

Because of their low penetrance, most genes that increase the risk of developing common diseases cannot be identified through family linkage studies.[a] Instead, such genes are best identified by analyzing disease occurrence in large populations. To carry out this type of investigation, researchers compare the genotypes of individuals who have the disease in question with individuals of a similar ethnic background who are not afflicted. The goal is to discover an association (or correlation) between a particular condition, such as diabetes, and common genetic polymorphisms. The potential value of this approach is illustrated by the discovery in the early 1990s of a strong association between one common allele of the gene encoding the lipoprotein ApoE and the likelihood of developing Alzheimer’s

[a] Not all genes involved in common diseases have a low penetrance. Alzheimer’s disease, for example, occurs with high frequency in persons having certain APP alleles (page 68) and breast cancer occurs with high frequency in persons having certain BRCA alleles (Section 16.3). But these types of rare, high-penetrance polymorphisms are not major factors in the development of the large majority of cases for either of these diseases.

10.6 Sequencing Genomes: The Footprints of Biological Evolution

Application of Genomic Analyses to Medicine

Chapter 10 The Nature of the Gene and the Genome

disease. It was found in these studies that individuals who have at least one copy of the APOE4 allele of the gene are much more likely to develop this disabling neurodegenerative disease than persons lacking the allele. These findings have opened a major branch of investigation into the relationship between cholesterol metabolism, cardiovascular health, and Alzheimer’s disease.

As the name suggests, genome-wide association studies (GWASs) look for associations between a disease state and polymorphisms that may be located anywhere in the genome. This requires determining the nucleotide sequence of large parts of the genome of all the subjects participating in the study. GWASs are predicated on the premise that common diseases are caused by common variants, that is, variants present in 1 percent or more of the population. If a disease-causing variant is present at a lower frequency, it will not be detected in these types of studies. As noted earlier, the most common types of genetic variability in humans are single nucleotide differences (SNPs), which are spread almost uniformly throughout the genome. It is likely that many of the SNPs that change the coding specificity of a gene, or the regulation of a gene’s expression, play an important role in our susceptibility to complex diseases. As the cost of genotyping human DNA samples has decreased dramatically, researchers have begun scanning the genomes of large numbers of people to identify SNPs that occur more often in people afflicted with a particular disease than in healthy individuals. Even if the identified SNPs are not directly responsible for the disease, they serve as genetic markers for a nearby locus that might be responsible.

These principles are well illustrated by a landmark genome-wide association study published in 2005 on age-related macular degeneration (AMD), which is the leading cause of blindness in the elderly.
Initially, the researchers compared 96 AMD cases with 50 controls, looking for an association of the disease with over 100,000 SNPs that had been genotyped in the subjects. They found a strong association between individuals with the disease and a common SNP present within the intron (noncoding section) of a gene called CFH that is involved in immunity and inflammation. Persons homozygous for this SNP had a 7.4-fold greater risk of having the disease. Once they had pinned down this region of the genome as being associated with AMD, they sequenced the entire CFH gene in 96 individuals. They found that the SNP they had identified was tightly linked to a polymorphism within the coding region of the CFH gene that placed a histidine at a particular site in the protein. Other alleles of the CFH protein that were not associated with a risk for AMD had a tyrosine at this position. These findings provide additional evidence that inflammation is an underlying factor in the development of AMD and point out a clear target in the search for treatments of the disease. A large number of GWASs have been conducted over the past few years. In many of these studies, the genomes of thousands of individuals have been scrutinized, and the results have been confirmed in independent analyses of separate case and control populations. These studies have become practical with the availability of commercially produced chips that house a million or more DNA snippets containing the most common SNPs in the human population. Researchers can incubate the chip with a DNA sample to be tested and quickly determine which of the possible SNPs is present at each genomic location in that person’s DNA. Once a particular SNP is identified as being associated with a particular disease, investigators can search for a likely gene in that region of the genome that is responsible for the association (see discussion below on haplotypes). 
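The case-control comparison at the heart of a GWAS can be made concrete with a small calculation. The Python sketch below computes an allelic odds ratio, a chi-square statistic, and a nominal p-value for a single SNP from a 2 × 2 table of allele counts. The counts are invented for illustration and are not the data from the AMD study; the function names are hypothetical; and the Bonferroni-style threshold (0.05 divided by the number of SNPs tested) is an assumed, commonly used convention that the text itself does not specify.

```python
import math


def odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Odds of the risk allele among cases divided by the odds among controls."""
    return (case_risk * ctrl_other) / (case_other * ctrl_risk)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Invented allele counts (two alleles per person, 100 cases and 100 controls).
CASES = (120, 80)       # (risk allele, other allele) among cases
CONTROLS = (80, 120)    # (risk allele, other allele) among controls
N_SNPS_TESTED = 100_000  # assumed number of SNPs scanned in the study

odds = odds_ratio(*CASES, *CONTROLS)
chi2 = chi_square_2x2(*CASES, *CONTROLS)
p = p_value_1df(chi2)
threshold = 0.05 / N_SNPS_TESTED  # assumed Bonferroni-style cutoff

print(f"odds ratio = {odds:.2f}, chi-square = {chi2:.1f}, p = {p:.2e}")
print("passes genome-wide threshold:", p < threshold)
```

With these invented counts the risk allele is clearly enriched among cases (odds ratio 2.25), yet the nominal p-value of roughly 6 × 10⁻⁵ does not pass the assumed cutoff for 100,000 tests. This is one reason why later studies enrolled thousands of subjects and pooled their data.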
Moreover, the data generated by a number of different studies on a particular disease can be pooled, which increases the sample size and gives investigators greater statistical power for identifying disease-causing variants. As a result of these GWASs, we now have lists of genes, as well as noncoding sites that regulate gene expression, for which certain alleles or variants increase the likelihood that an individual will develop various diseases, including heart disease, Crohn’s disease, types 1 and 2 diabetes, and several types of cancer.

When researchers began these association studies, it was hoped that the studies would reveal new insights into the underlying basis of common diseases by turning up genes not previously suspected of being involved in particular diseases. The products of such genes may ultimately become targets of new avenues of drug therapy, just as identification of the importance of the HMG CoA reductase gene and LDL receptor gene in the disease familial hypercholesterolemia (page 319) led to the development of the cholesterol-lowering statins. Although these genetic association studies have not yet led to the emergence of new drugs, they have uncovered important mechanisms and pathways involved in the development of numerous common diseases. To cite just one example, several of the susceptibility alleles that contribute to type 2 diabetes encode proteins that function in the insulin-secreting cells of the pancreas, which has focused renewed attention on the dysfunction of these cells in the development of this disease. (Readers interested in examining the results of these studies can find them at the user-friendly site, www.snpedia.com.)

Despite the fact that GWASs have turned up many hundreds of common genetic variants that contribute to dozens of common diseases, they have also left many clinical geneticists disappointed. For many years, we have known the degree to which heredity contributes to the overall susceptibility of most common diseases. This measure has been obtained primarily by comparing the incidence of a given disease in members of the same family, as compared to members of the general population.
When the results of the recent GWASs are examined, the vast majority of the common “susceptibility alleles” that have been identified lead to only a small increase in the risk of developing a particular disease, typically less than 1.5 times the odds without the allele. If one adds up the increased risk a person would bear if he or she carried all of the identified susceptibility alleles for a particular disease, such as Crohn’s disease or type 2 diabetes, it does not come close to accounting for the contribution that genetics is known to play in the development of that disease. In other words, there is a great deal of “missing heritability” that is yet to be discovered. It was initially thought that much of the missing heritability might be accounted for by structural variations (page 416), which are not subject to detection in association studies that rely on SNPs. However, as techniques have been developed to map the structural variants in the human genome, this notion now also seems unlikely. Many factors could be responsible for the fact that most of the genetic risk factors for common diseases have yet to be identified. For example, a very large number of common alleles may contribute such a small risk (i.e., have a very low penetrance) that they have not been detected in these GWASs. If this turns out to be the case, then identification of such low-penetrant variants may be of little practical value. On the opposite end of the scale, there may be a large number of rare variants that significantly raise the risk of disease (i.e., they are moderately penetrant). However, because they are rare (e.g., 0.1 to 1 percent frequency), this group of variants would also have been overlooked in the current GWASs. Another explanation of much of the missing heritability is a phenomenon called epistasis, which describes the observation that the presence of a particular allele at one genetic locus can affect the expression of an allele at a different locus. 
In other words, genes can interact with one another, and thus researchers may have to consider the role of specific combinations of alleles as determinants of particular phenotypes. It is difficult enough to determine the effects of individual variants on a particular disease outcome without attempting to measure the effects of combinations of variants.

Despite the present difficulties, it may be possible, one day, to determine whether a person is genetically predisposed to develop a particular disease simply by identifying which nucleotides are present at key positions within that person’s genome. Information about genetic variation may also provide an indication as to how persons will react to a particular drug, whether they are likely to be helped by it and/or whether they may develop serious side effects. To cite just one example, individuals who carry an allele with two specific SNPs in a gene encoding the enzyme TPMT are unable to metabolize a class of thiopurine drugs that are commonly used to treat a type of childhood leukemia. Persons who are homozygous for this allele are at very high risk of developing a life-threatening suppression of their bone marrow when treated with normal doses of thiopurine drugs. Most diseases can be treated by a number of alternate medications, so that proper drug selection is an important aspect of the practice of medicine. The pharmaceutical industry is hoping that SNP data will eventually lead to an era of “customized drug therapy,” allowing physicians to prescribe specific drugs at specific doses that are tailored to each individual patient based on their genetic profile. This era may have begun with the FDA approval of a DNA-containing chip that allows physicians to screen a patient’s DNA to determine which variants of two different cytochrome P-450 genes they possess. These genes help determine how efficiently a person is able to metabolize various drugs, ranging from painkillers and antidepressants to over-the-counter heartburn medications.

The use of SNP data in association studies presents a formidable challenge because of the sheer number of these polymorphisms within the genome. As researchers began to study the distribution of SNPs in different human populations, they made a striking discovery that has made association studies much simpler: large blocks of SNPs have been inherited together as a unit over the course of recent human evolution. To illustrate: if you were told the particular base that resided at each of a handful of polymorphic sites in a particular region of a chromosome, then the bases at all of the other polymorphic sites in the region could be predicted with reasonably high (e.g., 90 percent) accuracy. Stretches of SNPs are thought to have remained together in the genome over a large number of generations because genetic recombination (i.e., crossing over) does not occur randomly along the DNA, as was suggested in the discussion of gene mapping on page 391. Instead, there are short segments of DNA (1–2 kb) where recombination is likely to occur and blocks of DNA between these “hot spots” that have a low frequency of recombination. As a result, certain blocks of DNA (typically about 20 kilobases long) tend to remain intact as they are transmitted from generation to generation. These blocks are called haplotypes. Haplotypes are like “giant multigenic alleles”; if you select a particular site on a particular chromosome, only a limited number of alternate haplotypes can be found in that region (Figure 1). Each of the alternate haplotypes is defined by a small number of SNPs (called “tag SNPs”) in that region of the genome. Once the identity of a handful of tag SNPs within a haplotype has been determined, the identity of the entire haplotype is known.

Figure 1 The genome is divided into blocks (haplotypes). The top line shows a hypothetical segment of DNA containing a number of SNPs (each SNP is indicated by a filled circle). This particular segment consists of five haplotypes separated by short stretches of DNA that are highly variable. Each haplotype occurs as a small number of variants. Each haplotype shown here exists as three to six different variants. Each haplotype variant is characterized by a specific set of SNPs, indicated by the colored circles. All of the SNPs of a particular haplotype variant are drawn in the same color to indicate that they are inherited as a group and are found together in different members of the population. Each person represented at the bottom of the illustration has a particular combination of haplotypes on each of their two chromosomes. Some haplotype variants are found in many different ethnic groups; others have a more restricted distribution. (Figure labels: SNP; Haplotype; Common haplotypes; “Person A”; “Person B.”)

The average length of haplotypes and the number of alternate versions of each haplotype vary among different ethnic populations. African populations exhibit the greatest variety of haplotypes, in keeping with other studies suggesting that the modern human species arose in Africa (page 415). Persons of Northern European descent have much longer haplotypes, with fewer alternate versions, than persons of African descent. This finding suggests that present-day Europeans have arisen from a relatively small population in which many of the haplotypes of their African ancestors had been lost. According to one “calculated guess,” a founding population of less than 500 individuals emigrated out of Africa around 60,000 years ago to give rise to all of the human populations that now reside on all of the other continents.

In 2002 about 25 groups of investigators began a collaboration, called the International HapMap Project, aimed at identifying and mapping the various haplotypes that exist within the human population. The HapMap would contain all of the common haplotypes (i.e., haplotypes present in at least 5 percent of the population) that were present along the length of each of the 24 different human chromosomes in 270 members of four ethnically different populations. The populations to be examined were the Yoruba of Africa, Han Chinese, Japanese, and Western European (descendants who had settled in Utah). The project was completed in 2006 and resulted in publication of a HapMap built on more than four million tag SNPs spaced evenly throughout the genome. Now that the HapMap is available, researchers are able to identify associations between a particular disease and a particular haplotype. Once such an association is made, the region of the genome containing the haplotype can be scrutinized for the particular gene(s) that contribute to disease susceptibility.

The analysis of HapMaps has also provided data on the origins and migrations of human populations and has yielded valuable insights into the factors that have shaped the human genome. This is illustrated by the following example. Whether or not we are able to drink milk as an adult without upsetting our stomach (i.e., whether or not we are lactose tolerant) depends on which alleles of the lactase gene we carry in our genome. Lactose tolerance is associated with an unusually long haplotype that contains a variation within a regulatory site that causes the lactase gene to be persistently expressed into adulthood. This particular haplotype is present at high frequency in European populations, which have had a long history of raising dairy animals, and is rare in most sub-Saharan African and Southeast Asian populations, which historically did not raise such animals. Its high frequency among Europeans suggests that this particular haplotype was under strong positive selective pressure in populations that depended on dairy products for nutrition. This single haplotype is over 1 million bases long, which indicates that it has not been together as a block long enough for crossing over to have had a chance to break it into smaller segments. In fact, it is estimated that this haplotype was subjected to strong selective pressure between 5,000 and 10,000 years ago, about the time that dairy farming is thought to have arisen. It is interesting to note that several East African populations that engage in dairy farming also exhibit persistent lactase expression. The mutations responsible for the lactose-tolerant phenotype in these Africans are different from the mutation found in Europeans, indicating that the trait evolved independently in the two groups and represents a good example of convergent evolution within the human lineage.

If we look ahead to the future, the greatest advances in “genomic medicine” are likely to spring from the current progress being made in DNA-sequencing technology. The Human Genome Project, which involved the efforts of more than a thousand researchers working for several years and led to the first sequence of the human genome, is estimated to have cost in the neighborhood of one billion dollars. By 2011, the price of sequencing an entire human genome had dropped below $10,000. This dramatic reduction in cost and effort has been made possible by the development of a new generation of methods to

determine DNA sequences. These efforts are expected to continue over the next few years and to culminate in the development of routine laboratory procedures for sequencing an individual’s genome for a few thousand dollars or less. If this goal is achieved, we will all be able to carry around a CD containing a list of the letters that make up our complete genotype. Even if no one is interested in learning about his or her unique genetic identity, researchers should be able to use the information obtained to identify virtually all of the sites in the genome that contribute to human disease. In fact, an international collaboration began in 2008 to compare the complete genome sequences of at least one thousand individuals. By the end of the pilot phase in 2010, the team had sequenced the whole genomes of 179 individuals and the protein-coding segments (collectively called the exome) of 697 individuals. It is estimated, based on the data, that each of us carries, on average, about 250 alleles that have been rendered nonfunctional due to the presence of mutations. The “1000 Genome Project” should allow researchers to make direct associations between a particular genomic variant and a particular trait instead of having to rely on the indirect GWASs that identify only the tag SNPs of the HapMap (which are not the actual causative defect). Readers can follow the progress of the project at www.1000genomes.org
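The tag-SNP shortcut that the HapMap-based studies rely on can be illustrated with a toy example. The Python sketch below takes a hypothetical block with four haplotype variants, searches for the smallest set of SNP positions that tells all four apart, and then identifies a person's haplotype from the bases at those positions alone. The sequences, names, and brute-force search are assumptions made for illustration; they are not HapMap data or the project's actual method.

```python
from itertools import combinations

# Hypothetical haplotype block: four common variants, six SNP positions each.
HAPLOTYPES = {
    "H1": "ACGTAG",
    "H2": "ACGCAT",
    "H3": "GTGTCG",
    "H4": "GTACCT",
}


def find_tag_snps(haplotypes):
    """Return the smallest set of positions whose bases distinguish every haplotype."""
    length = len(next(iter(haplotypes.values())))
    for size in range(1, length + 1):
        for positions in combinations(range(length), size):
            signatures = {tuple(seq[i] for i in positions)
                          for seq in haplotypes.values()}
            if len(signatures) == len(haplotypes):
                return positions
    return tuple(range(length))


def identify_haplotype(haplotypes, tags, observed):
    """Return the name of the haplotype matching the bases observed at the tag positions."""
    for name, seq in haplotypes.items():
        if all(seq[i] == base for i, base in zip(tags, observed)):
            return name
    return None


tags = find_tag_snps(HAPLOTYPES)
match = identify_haplotype(HAPLOTYPES, tags, "GC")
print("tag SNP positions:", tags)
print("haplotype carrying G and C at those positions:", match, HAPLOTYPES[match])
```

In this toy block, genotyping two of the six positions is enough to recover the other four bases. Note that the tag positions serve only as markers for the block; they say nothing about which site within it, if any, is the causative variant.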

EXPERIMENTAL PATHWAYS


The Chemical Nature of the Gene

Three years after Gregor Mendel presented the results of his work on inheritance in pea plants, Friedrich Miescher graduated from a Swiss medical school and traveled to Tübingen, Germany, to spend a year studying under Ernst Hoppe-Seyler, one of the foremost chemists (and possibly the first biochemist) of the period. Miescher was interested in the chemical contents of the cell nucleus. To isolate material from cell nuclei with a minimum of contamination from cytoplasmic components, Miescher needed cells that had large nuclei and were easy to obtain in quantity. He chose white blood cells, which he obtained from the pus in surgical bandages that were discarded by a local clinic. Miescher treated the cells with dilute hydrochloric acid to which he added an extract from pig’s stomach that removed protein (the stomach extract contained the proteolytic enzyme pepsin). The residue from this treatment was composed primarily of isolated cell nuclei that settled to the bottom of the vessel. Miescher then extracted the nuclei with dilute alkali. The alkali-soluble material was further purified by precipitation with dilute acid and reextraction with dilute alkali. Miescher found that the alkaline extract contained a substance that had properties unlike any previously discovered: the molecule was very large, acidic, and rich in phosphorus. He called the material “nuclein.” His year up, Miescher returned home to Switzerland, while Hoppe-Seyler, who was cautious about the findings, repeated the work. With the results confirmed, Miescher published the findings in 1871.1

Back in Switzerland, Miescher continued his studies of the chemistry of the cell nucleus. Living near the Rhine River, Miescher had ready access to salmon that had swum upstream and were ripe with eggs or sperm. Sperm were ideal cells to study nuclei. Like white blood cells, they could be obtained in large quantity, and 90 percent of their volume was occupied by nuclei.
Miescher’s nuclein preparations from sperm cells contained a higher percentage of phosphorus (almost 10 percent by weight) than those from white blood cells, indicating they had less contaminating protein. In fact,

they were the first preparations of relatively pure DNA. The term nucleic acid was coined in 1889 by Richard Altmann, one of Miescher’s students, who worked out the methods of purifying protein-free DNA from various animal tissues and yeast.2 During the last two decades of the nineteenth century, numerous biologists focused on the chromosomes, describing their behavior during and between cell division (page 388). One way to observe chromosomes was to stain these cellular structures with dyes. A botanist named E. Zacharias discovered that the same stains that made the chromosomes visible also stained a preparation of nuclein that had been extracted using Miescher’s procedure of digestion with pepsin in an HCl medium. Furthermore, when the pepsin–HCl-extracted cells were subsequently extracted with dilute alkali, a procedure that was known to remove nuclein, the cell residue (which included the chromosome remnants) no longer contained stainable material. These and other results pointed strongly to nuclein as a component of the chromosomes. In one remarkably farsighted proposal, Otto Hertwig, who had been studying the behavior of the chromosomes during fertilization, stated in 1884: “I believe that I have at least made it highly probable that nuclein is the substance that is responsible not only for fertilization but also for the transmission of hereditary characteristics.”3 Ironically, as more was learned about the properties of nuclein, the less it was considered as a candidate for the genetic material. During the 50 years that followed Miescher’s discovery of DNA, the chemistry of the molecule and the nature of its components were described. Some of the most important contributions in this pursuit were made by Phoebus Aaron Levene, who immigrated from Russia to the United States in 1891 and eventually settled in a position at the Rockefeller Institute in New York. 
It was Levene who finally solved one of the most resistant problems of DNA chemistry when he determined in 1929 that the sugar of the nucleotides was 2-deoxyribose.4 To isolate the sugar, Levene and E. S. London placed the DNA into the stomach of a dog through a surgical opening and then collected the sample from the animal’s intestine. As it passed through the stomach and intestine, various enzymes of the animal’s digestive tract acted on the DNA, carving the nucleotides into their component parts, which could then be isolated and analyzed. Levene summarized the state of knowledge on nucleic acids in a monograph published in 1931.5

While Levene is credited for his work in determining the structure of the building blocks of DNA, he is also credited as having been the major stumbling block in the search for the chemical nature of the gene. Through this period, it became increasingly evident that proteins were very complex and exhibited great specificity in catalyzing a remarkable variety of chemical reactions. DNA, on the other hand, was thought to be composed of a monotonous repeat of its four nucleotide building blocks. The major proponent of this view of DNA, which was called the tetranucleotide theory, was Phoebus Levene. Because chromosomes consisted of only two components—DNA and protein—there seemed little doubt at that time that protein was the genetic material.

Meanwhile, as the structure of DNA was being worked out, a seemingly independent line of research was being carried out in the field of bacteriology. During the early 1920s, it was found that a number of species of pathogenic bacteria could be grown in the laboratory in two different forms. Virulent bacteria, that is, bacterial cells capable of causing disease, formed colonies that were smooth, dome shaped, and regular. In contrast, nonvirulent bacterial cells grew into colonies that were rough, flat, and irregular (Figure 1).6 The British microbiologist J. A. Arkwright introduced the terms smooth (S) and rough (R) to describe these two types. Under the microscope, cells that formed S colonies were seen to be surrounded by a gelatinous capsule, whereas the capsule was absent from cells in the R colonies. The bacterial capsule helps protect a bacterium from its host’s defenses, which explains why the R cells, which lacked these structures, were unable to cause infections in laboratory animals.

Figure 1 The large glistening colonies on the right are virulent S-type pneumococci, whereas the smaller colonies on the left are nonvirulent R-type pneumococci. As discussed below, the cells in these particular S colonies are the result of transformation of R bacteria by DNA from heat-killed S pneumococci. (FROM O. T. AVERY, C. M. MACLEOD, AND M. MCCARTY, J. EXP. MED. 79:153, 1944. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

Because of its widespread impact on human health, the bacterium responsible for causing pneumonia (Streptococcus pneumoniae, or simply pneumococcus) has long been a focus of attention among microbiologists. In 1923, Frederick Griffith, a medical officer at the British Ministry of Health, demonstrated that pneumococcus also grew as either S or R colonies and, furthermore, that the two forms were interconvertible; that is, on occasion an R bacterium could revert to an S form, or vice versa.7 Griffith found, for example, if he injected exceptionally large numbers of R bacteria into a mouse, the animal frequently developed pneumonia and produced bacteria that formed colonies of the S form. It had been shown earlier that pneumococcus occurred as several distinct types (types I, II, and III) that could be distinguished from one another immunologically. In other words, antibodies could be obtained from infected animals that would react with only one of the three types. Moreover, a bacterium of one type never gave rise to cells of another type. Each of the three types of pneumococcus could occur as either the R or S form.

In 1928, Griffith made a surprising discovery while injecting various bacterial preparations into mice.8 Injections of large numbers of heat-killed S bacteria or small numbers of living R bacteria, by themselves, were harmless to the mouse. However, if he injected both of these preparations together into the same mouse, it contracted pneumonia and died. Virulent bacteria could be isolated from the mouse and cultured. To extend the findings, he injected combinations of bacteria of different types (Figure 2). Initially, eight mice were injected with heat-killed, type I, S bacteria together with a small inoculum of a live, type II, R strain. Two of the eight animals contracted pneumonia, and Griffith was able to isolate and culture virulent, type I, S bacterial cells from the infected mice. Because there was no possibility that the heat-killed bacteria had been brought back to life, Griffith concluded that the dead type I cells had provided something to the live, nonencapsulated, type II cells that transformed them into an encapsulated type I form. The transformed bacteria continued to produce type I cells when grown in culture; thus the change was stable and permanent.

Figure 2 Outline of the experiment by Griffith of the discovery of bacterial transformation. (Panels, each showing an injection into a mouse: live encapsulated (S) cells, type I, mouse dies; live capsuleless (R) cells, type II, mouse survives; heat-killed S cells, type I, mouse survives; heat-killed S cells, type I, mixed with live R cells, type II, mouse dies.)

Griffith’s finding of transformation was rapidly confirmed by several laboratories around the world, including that of Oswald Avery, an immunologist at the Rockefeller Institute, the same institution where Levene was working. Avery had initially been skeptical of the idea that a substance released by a dead cell could alter the appearance of a living cell, but he was convinced when Martin Dawson, a young associate in his lab, confirmed the results.9 Dawson went on to show that transformation need not occur within a living animal host. A crude extract of the dead S bacteria, when mixed in bacterial culture with a small number of the nonvirulent (R) cells in the presence of anti-R serum, was capable of converting the R cells into the virulent S form. The transformed cells were always of the type (I, II, or III) characteristic of the dead S cells.10

The next major step was taken by J. Lionel Alloway, another member of Avery’s lab, who was able to solubilize the transforming agent. This was accomplished by rapidly freezing and thawing the killed donor cells, then heating the disrupted cells, centrifuging the suspension, and forcing the supernatant through a porcelain filter whose pores blocked the passage of bacteria. The soluble, filtered extract possessed the same transforming capacity as the original heat-killed cells.11

For the next decade, Avery and his colleagues focused their attention on purifying the substance responsible for transformation and determining its identity. As remarkable as it may seem today, no other laboratory in the world was pursuing the identity of the “transforming principle,” as Avery called it. Progress on the problem was slow.12 Eventually, Avery and his co-workers, Colin MacLeod and Maclyn McCarty, succeeded in isolating a substance from the soluble extract that was active in causing transformation when present at only 1 part per 600 million. All the evidence suggested that the active substance was DNA: (1) it exhibited a host of chemical properties that were characteristic of DNA, (2) no other type of material could be detected in the preparation, and (3) tests of various enzymes indicated that only those enzymes that digested DNA inactivated the transforming principle.
The paper published in 1944 was written with scrupulous caution and made no dramatic statements that genes were made of DNA rather than protein.13 The paper drew remarkably little attention. Maclyn McCarty, one of the three authors, describes an incident in 1949 when he was asked to speak at Johns Hopkins University along with Leslie Gay, who had been testing the effects of the new drug Dramamine for the treatment of sea sickness. The large hall was packed with people, and “after a short period of questions and discussion following [Gay’s] paper, the president of the Society got up to introduce me as the second speaker. Very little that he said could be heard because of the noise created by people streaming out of the hall. When the exodus was complete, after I had given the first few minutes of my talk, I counted approximately thirty-five hardy souls who remained in the audience because they wanted to hear about pneumococcal transformation or because they felt they had to remain out of courtesy.” But Avery’s awareness of the potential of his discovery was revealed in a letter he wrote in 1943 to his brother Roy, also a bacteriologist: If we are right, & of course that’s not yet proven, then it means that nucleic acids are not merely structurally important but functionally active substances in determining the biochemical activities and specific characteristics of cells—& that by means of a known chemical substance it is possible to induce predictable and hereditary changes in cells. This is something that has long been the dream of geneticists. . . . Sounds like a virus— may be a gene. But with mechanisms I am not now concerned—one step at a time. . . . Of course the problem bristles with implications. . . . It touches genetics, enzyme chemistry, cell metabolism & carbohydrate synthesis—etc. 
But today it takes a lot of well documented evidence to convince anyone that the sodium salt of deoxyribose nucleic acid, protein free, could possibly be endowed with such biologically active & specific properties & that evidence we are now trying to get. It’s lots of fun to blow bubbles,—but it’s wiser to prick them yourself before someone else tries to.

Many articles and passages in books have dealt with the reasons why Avery’s findings were not met with greater acclaim. Part of the reason may be due to the subdued manner in which the paper was written and the fact that Avery was a bacteriologist, not a geneticist. Some biologists were persuaded that Avery’s preparation must have been contaminated with minuscule amounts of protein and that the contaminant, not the DNA, was the active transforming agent. Others questioned whether studies on transformation in bacteria had any relevance to the field of genetics. During the years following the publication of Avery’s paper, the climate in genetics changed in an important way. The existence of the bacterial chromosome was recognized, and a number of prominent geneticists turned their attention to these prokaryotes. These scientists believed that knowledge gained from the study of the simplest cellular organisms would shed light on the mechanisms that operate in the most complex plants and animals. In addition, the work of Erwin Chargaff on the base composition of DNA shattered the notion that DNA was a molecule consisting of a simple repetitive series of nucleotides (discussed on page 394).14 This finding awakened researchers to the possibility that DNA might have the properties necessary to fulfill a role in information storage. Eight years after the publication of Avery’s paper on bacterial transformation, Alfred Hershey and Martha Chase of the Cold Spring Harbor Laboratories in New York turned their attention to an even simpler system—bacteriophages, or viruses that infect bacterial cells. By 1950, researchers recognized that viruses also had a genetic program. Viruses injected their genetic material into the host bacterium, where it directed the formation of new virus particles inside the infected cell. Within a matter of minutes, the infected cell broke open, releasing new bacteriophage particles, which then infected neighboring host cells.
It was clear that the genetic material directing the formation of viral progeny had to be either DNA or protein because these were the only two molecules the virus contained. Electron microscopic observations showed that, during the infection, the bulk of the bacteriophage remains outside the cell, attached to the cell surface by tail fibers (Figure 3). Hershey and Chase reasoned that the virus’s

Figure 3 Electron micrograph of a bacterial cell infected by T4 bacteriophage. Each phage is seen attached by its tail fibers to the outer surface of the bacterium, while new phage heads are being assembled in the host cell’s cytoplasm. (LEE D. SIMON/PHOTO RESEARCHERS, INC.)

genetic material must possess two properties. First, if the material were to direct the development of new bacteriophages during infection, it must pass into the infected cell. Second, if the material carries inherited information, it must be passed on to the next generation of bacteriophages. Hershey and Chase prepared two batches of bacteriophages to use for infection (Figure 4). One batch contained radioactively labeled DNA ([32P]DNA); the other batch contained radioactively labeled protein ([35S]protein). Because DNA lacks sulfur (S) atoms, and protein usually lacks phosphorus (P) atoms, these two radioisotopes provided distinguishing labels for the two types of macromolecules. Their experimental plan was to allow one or the other type of bacteriophage to infect a population of bacterial cells, wait a few minutes, and then strip the empty viruses from the surfaces of the cells. After trying several methods to separate the bacteria from the attached phage coats, they found this was best accomplished by subjecting the infected suspension to the spinning blades of a Waring blender. Once the virus particles had been detached from the cells, the bacteria could be centrifuged to the bottom of the tube, leaving the empty viral coats in suspension. By following this procedure, Hershey and Chase determined the amount of radioactivity that entered the cells versus that which remained behind in the empty coats. They found that when cells were infected with protein-labeled bacteriophages, the bulk of the radioactivity remained in the empty coats. In contrast, when cells were infected with DNA-labeled bacteriophages, the bulk of the radioactivity passed inside the host cell. When they monitored the radioactivity passed onto the next generation, they found that less than

1 percent of the labeled protein could be detected in the progeny, whereas approximately 30 percent of the labeled DNA could be accounted for in the next generation. The publication of the Hershey-Chase experiments in 1952,15 together with the abandonment of the tetranucleotide theory, removed any remaining obstacles to the acceptance of DNA as the genetic material. Suddenly, tremendous new interest was generated in a molecule that had largely been ignored. The stage was set for the discovery of the double helix.

References
1. MIESCHER, J. F. 1871. Hoppe-Seyler’s Med. Chem. Untersuchungen. 4:441.
2. ALTMANN, R. 1889. Anat. u. Physiol., Physiol. Abt. 524.
3. Taken from MIRSKY, A. E. 1968. The discovery of DNA. Sci. Am. 218:78–88. (June)
4. LEVENE, P. A. & LONDON, E. S. 1929. The structure of thymonucleic acid. J. Biol. Chem. 83:793–802.
5. LEVENE, P. A. & BASS, L. W. 1931. Nucleic Acids. The Chemical Catalog Co.
6. ARKWRIGHT, J. A. 1921. Variation in bacteria in relation to agglutination both by salts and by specific serum. J. Path. Bact. 24:36–60.
7. GRIFFITH, F. 1923. The influence of immune serum on the biological properties of pneumococci. Rep. Public Health Med. Subj. 18:1–13.
8. GRIFFITH, F. 1928. The significance of pneumococcal types. J. Hygiene 27:113–159.
9. DAWSON, M. H. 1930. The transformation of pneumococcal types. J. Exp. Med. 51:123–147.

Figure 4 The Hershey-Chase experiment showing that bacterial cells infected with phage containing 32P-labeled DNA (red DNA molecules) became radioactively labeled and produced labeled phage progeny. In contrast, bacterial cells infected with phage containing 35S-labeled protein (red phage coats) did not become radioactively labeled and produced only nonlabeled progeny. Panel labels: in both (a) and (b), adsorbed particles are sheared off in a blender and the radioactivity divides ~80% / ~20% between the resulting fractions; with labeled protein, less than 1% of radioactivity is transferred to progeny; with labeled DNA, a considerable % of the original radioactivity is present in the phage progeny produced from the bacteria.

10.6 Sequencing Genomes: The Footprints of Biological Evolution

10. DAWSON, M. H. & SIA, R.H.P. 1931. In vitro transformation of pneumococcal types. J. Exp. Med. 54:701–710.
11. ALLOWAY, J. L. 1932. The transformation in vitro of R pneumococci into S forms of different specific types by use of filtered pneumococcus extracts. J. Exp. Med. 55:91–99.
12. MCCARTY, M. 1985. The Transforming Principle: Discovering That Genes Are Made of DNA. Norton.
13. AVERY, O. T., MACLEOD, C. M., & MCCARTY, M. 1944. Studies on the chemical nature of the substance inducing transformation of pneumococcal types: Induction of transformation by a deoxyribonucleic acid fraction isolated from pneumococcus type III. J. Exp. Med. 79:137–158.
14. CHARGAFF, E. 1950. Chemical specificity of nucleic acids and mechanism of their enzymic degradation. Experientia 6:201–209.
15. HERSHEY, A. D. & CHASE, M. 1952. Independent functions of viral protein and nucleic acid in growth of bacteriophage. J. Gen. Physiol. 36:39–56.


Synopsis

Chromosomes are the carriers of genetic information. A number of early observations led biologists to consider the genetic role of chromosomes. These included observations of the precision with which chromosomes are divided between daughter cells during cell division; the realization that the chromosomes of a species remained constant in shape and number from one division to the next; the finding that embryonic development required a particular complement of chromosomes; and the observation that the chromosome number is divided in half prior to formation of the gametes and is doubled following union of a sperm and egg at fertilization. Mendel’s findings provided biologists with a new set of criteria for identifying the carriers of the genes. Sutton’s studies of gamete formation in the grasshopper revealed the existence of homologous chromosomes, the association of homologues during the cell divisions that preceded gamete formation, and the separation of the homologues during the first of these meiotic divisions. (p. 388)

If genes are packaged together on chromosomes that are passed from parents to offspring, then genes on the same chromosome should be linked to one another to form a linkage group. The existence of linkage groups was confirmed in various systems, particularly in fruit flies where dozens of mutations were found to assort into four linkage groups that correspond in size and number to the chromosomes present in the cells of these insects. At the same time, it was discovered that linkage was incomplete; that is, the alleles originally present on a given chromosome did not necessarily remain together during formation of the gametes, but could be reshuffled between maternal and paternal homologues. This phenomenon, which Morgan called crossing over, was proposed to be the result of breakage and reunion of segments of homologous chromosomes that occurred while homologues were physically associated during the first meiotic division. Analyses of offspring from matings between adults carrying a variety of mutations on the same chromosome indicated that the frequency of recombination between two genes provided a measure of the relative distance that separates those two genes. Thus, recombination frequencies could be used to prepare detailed maps of the serial order of genes along each chromosome of a species. Genetic maps of the fruit fly based on recombination frequencies were verified independently by examination of the locations of various bands in the giant polytene chromosomes found in certain larval tissues of these insects. (p. 389)

Experiments discussed in the Experimental Pathways provided conclusive evidence that DNA was the genetic material. DNA is a helical molecule consisting of two chains of nucleotides running in opposite directions with their backbones on the outside and the nitrogenous bases facing inward. Adenine-containing nucleotides on one strand always pair with thymine-containing nucleotides on the other strand, likewise for guanine- and cytosine-containing nucleotides. As a result, the two strands of a DNA molecule are complementary to one another. Genetic information is encoded in the specific linear sequence of nucleotides that makes up the strands. The Watson-Crick model of DNA structure suggested a mechanism of replication that included strand separation and the use of each strand as a template that directed the order of nucleotide assembly during construction of the complementary strand. The mechanism by which DNA governed the assembly of a specific protein remained a total mystery. The DNA molecule depicted in Figure 10.10 is in a relaxed state having 10 base pairs per turn of the helix. DNA within a cell tends to be underwound (contains a greater number of base pairs per turn) and is said to be negatively supercoiled, a condition that facilitates the separation of strands during replication and transcription. The supercoiled state of DNA is altered by topoisomerases, enzymes that are able to cut, rearrange, and reseal DNA strands. (p. 393)

All of the genetic information present in a single haploid set of chromosomes of an organism constitutes that organism’s genome. The variety of DNA sequences that make up the genome and the number of copies of these various sequences describe the complexity of the genome. Understanding the complexity of a genome has grown out of early studies showing that the two strands that make up a DNA molecule can be separated by heat; when the temperature of the solution is lowered, complementary single strands are capable of reassociating to form stable, double-stranded DNA molecules. Analysis of the kinetics of this reassociation process provides a measure of the concentration of complementary sequences, which in turn provides a measure of the variety of sequences that are present within a given quantity of DNA. The greater the number of copies of a particular sequence in the genome, the greater is its concentration and the faster it reanneals. (p. 398)

When DNA fragments from eukaryotic cells are allowed to reanneal, the curve typically shows three steps, which correspond to the reannealing of three different classes of DNA sequences. The highly repeated fraction consists of short DNA sequences that are repeated in great number; these include satellite DNAs situated at the centromeres of the chromosomes, minisatellite DNAs, and microsatellite DNAs. The latter two groups tend to be highly variable, causing certain inherited diseases and forming the basis of DNA fingerprint techniques. The moderately repeated fraction includes DNA sequences that encode ribosomal and transfer RNAs and histone proteins, as well as various sequences with noncoding functions. The nonrepeated fraction contains protein-coding genes, which are present in one copy per haploid set of chromosomes. (p. 400)

The sequence organization of the genome is capable of change, either slowly over the course of evolution or rapidly as the result of transposition. The genes encoding eukaryotic proteins are often members of multigene families whose members show evidence that they have evolved from a common ancestral gene. The first step in this process is thought to be the duplication of a gene, which probably occurs primarily by unequal crossing over. Once duplication has occurred, nucleotide substitutions would be expected to modify various members in different ways, producing a family of repeated

sequences of similar but nonidentical structure. The globin genes, for example, consist of clusters of genes located on two different chromosomes. Each cluster contains a number of related genes that code for globin polypeptides produced at different stages in the life of an animal. The clusters also contain pseudogenes, which are homologous to the globin genes, but are nonfunctional. (p. 406)

Certain DNA sequences are capable of moving rapidly from place to place in the genome by transposition. These transposable elements are called transposons, and they are capable of integrating themselves randomly throughout the genome. The best studied transposons occur in bacteria. They are characterized by inverted repeats at their termini, an internal segment that codes for a transposase required for their integration, and the formation of short repeated sequences in the target DNA that flank the element at the site of integration. Eukaryotic transposable elements are capable of moving by several mechanisms. In most cases, the element is transcribed into RNA, which is copied by a reverse transcriptase encoded by the element, and the DNA copy is integrated into the target site. The moderately repeated fraction of human DNA contains two large families of transposable elements, the Alu and L1 families. (p. 408)

Over the past decade or so, large numbers of genomes, both prokaryotic and eukaryotic, have been sequenced. These efforts have provided remarkable insight into the arrangement and evolution of both protein-coding and noncoding components of the genome. The human genome contains only about 21,000 protein-coding genes, but many more polypeptides can be synthesized as the result of gene-enhancing activities, such as alternative splicing. Information about the functional elements of the human genome can be gained by comparing the sequence with homologous sections of other genomes to determine those portions that are conserved and those that tend to undergo sequence diversification. Conserved sequences are presumed to have a function, either in a coding or regulatory capacity. Chimpanzee and human genomes differ by about 4 percent. Comparison of sequences in the two genomes provides information on genes that have been subjected to positive selection in the human lineage and underlie some of our unique human characteristics. (p. 411)

Analytic Questions

1. How would Mendel’s results have differed if two of the seven traits he studied were encoded by genes that resided close to one another on one of the chromosomes of a pea plant?
2. Sutton was able to provide visual evidence for Mendel’s law of segregation. Why would it have been impossible for him visually to confirm or refute Mendel’s law of independent assortment?
3. Genes X, Y, and Z are all located on one chromosome. Draw a simple map showing the gene order and relative distances among the genes X, Y, and Z using the data below:

   Crossover Frequency    Between These Genes
   36%                    X-Z
   10%                    Y-Z
   26%                    X-Y

4. Alleles on opposite ends of a chromosome are so likely to be separated by crossing over between them that they segregate independently. How would one be able to determine, using genetic crosses, that these two genes belong to the same linkage group? How might this be established using nucleic acid hybridization?
5. Where in the curve of Figure 10.17 would you expect to see the reannealing of the DNA that codes for ribosomal RNA? Suppose you had a pure preparation of DNA fragments whose sequences code for ribosomal RNA. Draw the reannealing of this DNA superimposed over the curve for total DNA shown in Figure 10.17.
6. Approximately 5 percent of the present human genome consists of segmental duplications that have arisen during the past 35 million years. How do you suppose researchers are able to estimate how long it has been since a particular region of a chromosome was duplicated?
7. It was noted on page 409 that at least 45 percent of the human genome is derived from transposable elements. The actual number could be much higher, but it is impossible to make a determination about the origin of many other sequences. Why do you suppose it is difficult to make assignments about the origin of many of the sequences in the human genome?
8. Suppose you had two solutions of DNA, one single stranded and the other double stranded, with equivalent absorbance of ultraviolet light. How would the concentrations of DNA compare in these two solutions?
9. What type of labeling pattern would you expect after in situ hybridization between mitotic chromosomes and labeled myoglobin mRNA? Between the same chromosomes and labeled histone DNA?
10. According to Chargaff’s determination of base composition, which of the following would characterize any sample of DNA? (1) [A] + [T] = [G] + [C]; (2) [A]/[T] = 1; (3) [G] = [C]; (4) [A] + [G] = [T] + [C]. If the C content of a preparation of double-stranded DNA is 15 percent, what is the A content?
11. If 30 percent of the bases on a single strand of DNA are T, then 30 percent of the bases on that strand are A. True or false? Why?
12. Would you agree with the statement that the Tm represents the temperature at which 50 percent of the DNA molecules in a solution are single stranded?
13. Suppose you were to find a primate that had a β-globin gene with only one intervening sequence. Do you think this animal may have evolved from a primitive ancestor that split away from other animals prior to the appearance of the second intron?
14. In 1996, a report was published in the journal Lancet on the level of trinucleotide expansion in the DNA of blood donors. These researchers found that the level of trinucleotide expansion decreased significantly with age. Can you provide any explanation for these findings?
15. Suppose you were looking at a particular SNP in the genome where half the population had an adenine (A) and the other half had a guanine (G). Is there any way that you might be able to determine which of these variants was present in ancestral humans and which became frequent during human evolution?
16. Look over Figures 10.23 and 10.28. The arrangement of genes depicted in the former figure has evolved over hundreds of millions of years, whereas that in the latter figure has evolved over thousands of years. Discuss the likely possibilities for the future evolution of the human amylase genes.
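As a worked illustration of the mapping logic summarized in the Synopsis (recombination frequency as a measure of the relative distance between two genes), the following minimal Python sketch orders three linked genes from pairwise crossover frequencies. The gene names P, Q, and R, their frequencies, and the function name are made up for illustration and are not data from the text. The sketch assumes the simplest case: the pair with the largest frequency marks the two outer genes, and double crossovers are ignored.

```python
def order_three_genes(freqs):
    """Order three linked genes from pairwise crossover frequencies (percent).

    freqs maps a frozenset of two gene names to the crossover frequency
    observed between them. Assumption: the pair that recombines most often
    lies farthest apart, so the remaining gene sits between them.
    """
    genes = set().union(*freqs)
    if len(genes) != 3 or len(freqs) != 3:
        raise ValueError("expected exactly three genes and three pairwise values")
    outer_pair = max(freqs, key=freqs.get)   # the two genes farthest apart
    (middle,) = genes - outer_pair           # the one gene left over
    left, right = sorted(outer_pair)
    return [
        (left, middle, freqs[frozenset((left, middle))]),
        (middle, right, freqs[frozenset((middle, right))]),
    ]


# Hypothetical data, chosen only to illustrate the method.
example = {
    frozenset(("P", "Q")): 12.0,
    frozenset(("Q", "R")): 8.0,
    frozenset(("P", "R")): 20.0,
}

for gene_a, gene_b, distance in order_three_genes(example):
    print(f"{gene_a} to {gene_b}: {distance}% recombination")
```

In this made-up case the two shorter intervals (12% and 8%) add up to the longest one (20%), which is the consistency check one would expect if the three genes lie in a line.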


11 Gene Expression: From Transcription to Translation

11.1 The Relationship between Genes, Proteins, and RNAs
11.2 An Overview of Transcription in Both Prokaryotic and Eukaryotic Cells
11.3 Synthesis and Processing of Eukaryotic Ribosomal and Transfer RNAs
11.4 Synthesis and Processing of Eukaryotic Messenger RNAs
11.5 Small Regulatory RNAs and RNA Silencing Pathways
11.6 Encoding Genetic Information
11.7 Decoding the Codons: The Role of Transfer RNAs
11.8 Translating Genetic Information
THE HUMAN PERSPECTIVE: Clinical Applications of RNA Interference
EXPERIMENTAL PATHWAYS: The Role of RNA as a Catalyst

In many ways, progress in biology over the past century is reflected in our changing concept of the gene. As the result of Mendel’s work, biologists learned that genes are discrete elements that govern the appearance of specific traits. Given the opportunity, Mendel might have argued for the concept that one gene determines one trait. Boveri, Weismann, Sutton, and their contemporaries discovered that genes have a physical embodiment as parts of the chromosome. Morgan, Sturtevant, and colleagues demonstrated that genes have specific addresses—they reside at particular locations on particular chromosomes, and these addresses remain constant from one individual of a species to the next. Griffith, Avery, Hershey, and Chase demonstrated that genes were composed of DNA, and Watson and Crick solved the puzzle of DNA structure, which explained how this remarkable macromolecule could encode heritable information. Although the formulations of these concepts were milestones along the path to genetic understanding, none of them addressed the mechanism by which the information stored in a gene is put to work in governing cellular activities. This is the major topic to be discussed in the present chapter. We will begin with additional insights into the nature of a gene, which brings us closer to its role in the expression of inherited traits.

A model of the large subunit of a prokaryotic ribosome at 2.4 Å resolution as determined by X-ray crystallography. This view looks into the subunit’s active site cleft, which consists entirely of RNA. RNA is shown in gray, protein in gold, and the active site is revealed by a bound inhibitor (green). (COURTESY OF THOMAS A. STEITZ, YALE UNIVERSITY.)


11.1 The Relationship between Genes, Proteins, and RNAs

The first meaningful insight into gene function was gained by Archibald Garrod, a Scottish physician who reported in 1908 that the symptoms exhibited by persons with certain rare inherited diseases were caused by the absence of specific enzymes. One of the diseases investigated by Garrod was alcaptonuria, a condition readily diagnosed because the urine becomes dark on exposure to air. Garrod found that persons with alcaptonuria lacked an enzyme in their blood that oxidized homogentisic acid, a compound formed during the breakdown of the amino acids phenylalanine and tyrosine. As homogentisic acid accumulates, it is excreted in the urine and darkens in color when oxidized by air. Garrod had discovered the relationship between a genetic defect, a specific enzyme, and a specific metabolic condition. He called such diseases “inborn errors of metabolism.” As seems to have happened with other early observations of basic importance in genetics, Garrod’s findings went unappreciated for decades. The idea that genes direct the production of enzymes was resurrected in the 1940s by George Beadle and Edward Tatum of the California Institute of Technology. They studied Neurospora, a tropical bread mold that grows in a very simple medium containing a single organic carbon source (e.g., a sugar), inorganic salts, and biotin (a B vitamin). Because it needs so little to live on, Neurospora was presumed to synthesize all of its required metabolites. Beadle and Tatum reasoned that an organism with such broad synthetic capacity should be very sensitive to enzymatic deficiencies, which should be easily detected using the proper experimental protocol. A simplified outline of their protocol is given in Figure 11.1. Beadle and Tatum’s plan was to irradiate mold spores and screen them for mutations that caused cells to lack a particular enzyme. To screen for such mutations, irradiated spores were tested for their ability to grow in a minimal medium that lacked the essential compounds known to be synthesized by

Figure 11.1 The Beadle-Tatum experiment for the isolation of genetic mutants in Neurospora. Spores were irradiated to induce mutations (step 1) and allowed to grow into colonies in tubes containing supplemented medium (step 2). Genetically identical spores produced by the colonies were then tested for their ability to grow on either supplemented or minimal medium (step 3). Those that failed to grow on minimal medium were mutants, and the task was to identify the mutant gene. In the example shown in step 4, a sample of mutant cells was found to grow in the minimal medium supplemented with vitamins, but would not grow in medium supplemented with amino acids. This observation indicates a deficiency in an enzyme leading to the formation of a vitamin. In step 5, growth of these same cells in minimal medium supplemented with one or another of the vitamins indicates the deficiency resides in a gene involved in the formation of pantothenic acid (part of coenzyme A).

Figure 11.1 labels: wild-type Neurospora; irradiate with X-rays or ultraviolet light; grow on supplemented medium; meiosis; spores produced by meiosis followed by a single mitosis; test growth capability on minimal medium, minimal medium + amino acids, and minimal medium + vitamins; mutants are able to grow on supplemented medium, but not on minimal medium; minimal medium supplemented with pyridoxine, p-aminobenzoic acid, choline, inositol, folic acid, pantothenic acid, niacin, riboflavin, or thiamine; control on minimal medium.


this organism (Figure 11.1). If a spore was unable to grow in minimal medium but a genetically identical spore could grow in a medium supplemented with a particular coenzyme (e.g., pantothenic acid of coenzyme A), then the researchers could conclude that the cells had an enzymatic deficiency that prevented them from synthesizing this essential compound. Beadle and Tatum began by irradiating over a thousand cells. Two of the spores proved unable to grow on the minimal medium: one needed pyridoxine (vitamin B6), and the other needed thiamine (vitamin B1). Eventually, the progeny of about 100,000 irradiated spores were tested, and dozens of mutants were isolated. Each mutant had a gene defect that resulted in an enzyme deficiency that prevented the cells from catalyzing a particular metabolic reaction. The results were clear-cut: a gene carries the information for the construction of a particular enzyme. This conclusion became known as the “one gene–one enzyme” hypothesis. Once it was learned that enzymes are often composed of more than one polypeptide, each of which is encoded by its own gene, the concept was modified to “one gene–one polypeptide.” Although this relationship remains a close approximation of the basic function of a gene, it too has had to be modified owing to the discovery that a single gene often generates a variety of polypeptides, primarily as the result of alternative splicing (discussed in Section 12.5). It would also become apparent that describing a gene strictly as an information store for polypeptides is far too narrow a definition. Many genes encode RNA molecules that, rather than containing information for polypeptide synthesis, function as RNAs in their own right. With this in mind, it might be best to define a gene as a segment of DNA that contains the information for either a single polypeptide chain or one or more functional RNAs.
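The screening logic described above can be sketched in a few lines of Python: a strain that grows on minimal medium shows no detectable deficiency, whereas a mutant is characterized by the single supplement that restores its growth. The function name, the dictionary layout, and the growth results are hypothetical and chosen only for illustration; they are not Beadle and Tatum’s data.

```python
def find_required_supplements(growth_by_medium):
    """Return the supplements that rescue a strain unable to grow on minimal medium.

    growth_by_medium maps a medium name to True (growth) or False (no growth).
    The key "minimal" is unsupplemented medium; every other key names the
    single compound added to minimal medium.
    """
    if growth_by_medium["minimal"]:
        return []  # grows unaided: no synthetic deficiency detected
    return sorted(
        name
        for name, grew in growth_by_medium.items()
        if name != "minimal" and grew
    )


# Hypothetical results for one irradiated strain (illustrative values only).
strain = {
    "minimal": False,
    "pyridoxine": False,
    "thiamine": False,
    "pantothenic acid": True,
    "niacin": False,
}

print(find_required_supplements(strain))
```

For this made-up strain, the only rescuing supplement is pantothenic acid, which is the kind of result that would point to a defect in a gene needed for synthesizing that compound.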

An Overview of the Flow of Information through the Cell

The information present in a segment of DNA is made available to the cell through formation of a molecule of RNA. The synthesis of an RNA from a DNA template is called transcription. The term transcription denotes a process in which

Figure 11.2 An overview of the flow of information in a eukaryotic cell. The DNA of the chromosomes located within the nucleus contains the entire store of genetic information. Selected sites on the DNA are transcribed into pre-mRNAs (step 1), which are processed into messenger RNAs (step 2). The messenger RNAs are transported out of the nucleus (step 3) into the cytoplasm, where they are translated into polypeptides by ribosomes that move along the mRNA (step 4). Following translation, the polypeptide folds to assume its native conformation (step 5).

the information encoded in the four deoxyribonucleotide letters of DNA is rewritten, or transcribed, into a similar language composed of four ribonucleotide letters of RNA. We will examine the mechanism of transcription shortly, but first we will continue with the role of a gene in the formation of polypeptides. Studies carried out in the 1950s uncovered the relationship between genetic information and amino acid sequence, but this knowledge by itself provided no clue as to the mechanism by which the specific polypeptide chain is generated. As we now know, there is an intermediate between a gene and its polypeptide; the intermediate is messenger RNA (mRNA). The momentous discovery of mRNA was made in 1961 by François Jacob and Jacques Monod of the Pasteur Institute in Paris, Sydney Brenner of the University of Cambridge, and Matthew Meselson of the California Institute of Technology. A messenger RNA is assembled as a complementary copy of one of the two DNA strands that make up a gene. Because its nucleotide sequence is complementary to that of the gene from which it is transcribed, the mRNA retains the same information for polypeptide assembly as the gene itself. For this reason, an mRNA can also be described as a “sense” strand or a coding RNA. Eukaryotic mRNAs are not synthesized in their final (or mature) form but, instead, must be carved out (or processed) from much larger pre-mRNAs. An overview of the role of mRNA in the flow of information through a eukaryotic cell is illustrated in Figure 11.2. The use of messenger RNA allows the cell to separate information storage from information utilization. While the gene remains stored away in the nucleus as part of a huge, stationary DNA molecule, its information can be imparted to a much smaller, mobile nucleic acid that passes into the cytoplasm. 
Once in the cytoplasm, the mRNA can serve as a template to direct the incorporation of amino acids in a particular order encoded by the nucleotide sequence of the DNA and mRNA. The use of a messenger RNA also allows a cell to greatly amplify its synthetic output. One DNA molecule can serve as the template in the formation of many mRNA molecules, each of which can be used in the formation of a large number of polypeptide chains.
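The complementary relationship between a gene and its mRNA can be illustrated with a minimal sketch. The function names and the nine-base template below are hypothetical and chosen only for illustration. The sketch copies a template strand into RNA and checks that the result carries the same sequence as the nontemplate DNA strand, with U in place of T.

```python
# Base-pairing rules for copying a DNA template into RNA (A in DNA pairs with U in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (written 5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def nontemplate_strand(template_3_to_5: str) -> str:
    """Return the nontemplate DNA strand (written 5'->3') that pairs with the template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"                 # hypothetical template strand, written 3'->5'
mrna = transcribe(template)            # "AUGCCGUAA"
coding = nontemplate_strand(template)  # "ATGCCGTAA"

# The mRNA matches the nontemplate strand except that U replaces T.
same_information = mrna == coding.replace("T", "U")
```

Because the template is written 3'→5', reading it left to right yields the RNA in its 5'→3' orientation without any reversal step.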

[Figure 11.2 labels: DNA of chromosome; segment of DNA being transcribed; pre-mRNAs; mRNAs; nucleus; cytoplasm; mRNA being translated; protein; steps 1 to 5.]

Proteins are synthesized in the cytoplasm by a complex process called translation. Translation requires the participation of dozens of different components, including ribosomes. Ribosomes are complex, cytoplasmic “machines” that can be programmed, like a computer, to translate the information encoded by any mRNA. Ribosomes contain both protein and RNA. The RNAs of a ribosome are called ribosomal RNAs (or rRNAs), and like mRNAs, each is transcribed from one of the DNA strands of a gene. Rather than functioning in an informational capacity, rRNAs provide structural support and catalyze the chemical reaction in which amino acids are covalently linked to one another. Transfer RNAs (or tRNAs) constitute a third major class of RNA that is required during protein synthesis. Transfer RNAs are required to translate the information in the mRNA nucleotide code into the amino acid “alphabet” of a polypeptide.

Both rRNAs and tRNAs owe their activity to their complex secondary and tertiary structures. Unlike DNA, which has a similar double-stranded, helical structure regardless of the source, many RNAs fold into a complex three-dimensional shape, which is markedly different from one type of RNA to another. Thus, like proteins, RNAs carry out a diverse array of functions because of their different shapes. As with proteins, the folding of RNA molecules follows certain rules. Whereas protein folding is driven by the withdrawal of hydrophobic residues into the interior, RNA folding is driven by the formation of regions having complementary base pairs (Figure 11.3). As seen in Figure 11.3, base-paired regions typically form double-stranded (and double-helical) “stems,” which are connected to single-stranded “loops.” Unlike DNA, which consists exclusively of standard Watson-Crick (G-C, A-T) base pairs, RNAs often contain nonstandard base pairs (inset, Figure 11.3) and modified nitrogenous bases. These unorthodox regions of the molecule serve as recognition sites for proteins and other RNAs, promote RNA folding, and help stabilize the structure of the molecule. The importance of complementary base-pairing extends far beyond tRNA and rRNA structure. As we will see throughout this chapter, base-pairing between RNA molecules plays a central role in most of the activities in which RNAs are engaged.

The roles of mRNAs, rRNAs, and tRNAs are explored in detail in the following sections of this chapter. Eukaryotic cells make a host of other RNAs, which also play vital roles in cellular metabolism; these include small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), small interfering RNAs (siRNAs), piRNAs, microRNAs (miRNAs), and a variety of other noncoding RNAs, that is, RNAs that do not contain information for amino acid sequences. Most of these RNAs serve a regulatory function, controlling various aspects of gene expression. Several of these RNAs have burst onto the scene in the past few years, reminding us once again that there are many hidden cellular activities waiting to be discovered. For background material on the structure of RNA, you might review page 77.
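The idea that stems form wherever two stretches of an RNA strand can pair, including through nonstandard G-U pairs, can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function names and the example sequences are hypothetical, modified bases are ignored, and no attempt is made to model the energetics that real structure-prediction methods use.

```python
# Pairs permitted in this sketch: the Watson-Crick pairs of RNA plus the
# nonstandard G-U pair mentioned in the text.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
                 ("G", "U"), ("U", "G")}


def can_form_stem(five_prime_arm: str, three_prime_arm: str) -> bool:
    """Return True if two arms (both written 5'->3') can pair along their full length.

    The arms of a stem run antiparallel, so the first base of one arm
    pairs with the last base of the other.
    """
    if len(five_prime_arm) != len(three_prime_arm):
        return False
    return all((a, b) in ALLOWED_PAIRS
               for a, b in zip(five_prime_arm, reversed(three_prime_arm)))


def count_gu_pairs(five_prime_arm: str, three_prime_arm: str) -> int:
    """Count the G-U pairs in an antiparallel alignment of the two arms."""
    return sum({a, b} == {"G", "U"}
               for a, b in zip(five_prime_arm, reversed(three_prime_arm)))


# Hypothetical arms of a stem; the loop between them is not modeled.
stem_ok = can_form_stem("GGGCG", "UGCCC")   # pairs: G-C, G-C, G-C, C-G, G-U
gu_pairs = count_gu_pairs("GGGCG", "UGCCC")
stem_bad = can_form_stem("GGGCG", "AGCCC")  # the final G-A pair is not allowed
```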

Figure 11.3 Two-dimensional structure of a bacterial ribosomal RNA showing the extensive base-pairing between different regions of the single strand. The expanded section shows the base sequence of a stem and loop, including a nonstandard base pair (G-U) and a modified nucleotide, methyladenosine. One of the helices is shaded differently because it plays an important role in ribosome function as discussed on page 470. An example of the three-dimensional structure of an RNA is shown in Figure 11.43b.

REVIEW
1. How were Beadle and Tatum able to conclude that a gene encoded a specific enzyme?
2. Distinguish between the two-dimensional and three-dimensional structure of RNAs.

11.2 | An Overview of Transcription in Both Prokaryotic and Eukaryotic Cells

Transcription is a process in which a DNA strand provides the information for the synthesis of an RNA strand. The enzymes responsible for transcription in both prokaryotic and eukaryotic cells are called DNA-dependent RNA polymerases, or simply RNA polymerases. These enzymes are able to incorporate nucleotides, one at a time, into a strand of RNA whose sequence is complementary to one of the DNA strands, which serves as the template. The first step in the synthesis of an RNA is the association of the polymerase with the DNA template. This brings up a matter of more general interest, namely, the specific interactions of two very different macromolecules, proteins and nucleic acids. Just as different proteins have evolved to bind different types of substrates and catalyze different types of reactions, so too have some of them evolved to recognize and

bind to specific sequences of nucleotides in a strand of nucleic acid. The site on the DNA to which an RNA polymerase molecule binds prior to initiating transcription is called the promoter. Cellular RNA polymerases are not capable of recognizing promoters on their own but require the help of additional proteins called transcription factors. In addition to providing

[Figure 11.4 labels: RNA polymerase; upstream DNA (previously transcribed), negatively supercoiled; downstream DNA (to be transcribed), positively supercoiled; transcription bubble; DNA–RNA hybrid; nascent RNA (5′ and 3′ ends); active site; Mg2+ ion; NTP; 5′ terminus of RNA; growing RNA chain (5′ → 3′); DNA template; PPi; Pi; H2O; H-bond.]

Figure 11.4 Chain elongation during transcription. (a) A schematic model of the elongation of a newly synthesized RNA molecule during transcription. The polymerase covers approximately 35 base pairs of DNA, the transcription bubble composed of single-stranded (melted) DNA contains about 15 base pairs, and the segment present in a DNA–RNA hybrid includes about 9 base pairs. The enzyme generates overwound (positively supercoiled) DNA ahead of itself and underwound (negatively supercoiled) DNA behind itself (page 397). These conditions are relieved by topoisomerases (page 398). (b) Schematic drawing of an RNA polymerase in the act of transcription elongation. The downstream DNA lies in a groove within the polymerase, clamped by a pair of jaws formed by the two largest subunits of the enzyme. The DNA makes a sharp turn in the region of the active site, so that the upstream DNA extends upward in this drawing. The nascent RNA exits from the enzyme’s active site through a separate channel. (c) Chain elongation results following an attack by the 3′ OH of the nucleotide at the end of the growing strand on the 5′ α-phosphate of the incoming nucleoside triphosphate. The pyrophosphate released is subsequently cleaved, which drives the reaction in the direction of polymerization. The geometry of base-pairing between the nucleotide of the template strand and the incoming nucleotide determines which of the four possible nucleoside triphosphates is incorporated into the growing RNA chain at each site. (d) Electron micrograph of several RNA polymerase molecules bound to a phage DNA template. (D: COURTESY OF ROBLEY C. WILLIAMS.)

a binding site for the polymerase, the promoter contains the information that determines which of the two DNA strands is transcribed and the site at which transcription begins. RNA polymerase moves along the template DNA strand toward its 5′ end (i.e., in a 3′ → 5′ direction). As the polymerase progresses, the DNA is temporarily unwound, and the polymerase assembles a complementary strand of RNA that grows from its 5′ terminus in a 3′ direction (Figure 11.4a,b). RNA polymerase catalyzes the highly favorable reaction

$$\mathrm{RNA}_n + \mathrm{NTP} \rightarrow \mathrm{RNA}_{n+1} + \mathrm{PP_i}$$

in which ribonucleoside triphosphate substrates (NTPs) are cleaved into nucleoside monophosphates as they are polymerized into a covalent chain (Figure 11.4c). Reactions leading to the synthesis of nucleic acids (and proteins) are inherently different from those of intermediary metabolism discussed in Chapter 3. Whereas some of the reactions leading to the formation of small molecules, such as amino acids, may be close enough to equilibrium that a considerable reverse reaction can be measured, those reactions leading to the synthesis of nucleic acids and proteins must occur under conditions in which there is virtually no reverse reaction. This condition is met during transcription with the aid of a second favorable reaction

$$\mathrm{PP_i} \rightarrow 2\,\mathrm{P_i}$$

catalyzed by a different enzyme, a pyrophosphatase. In this case, the pyrophosphate (PPi) produced in the first reaction is hydrolyzed to inorganic phosphate (Pi). The hydrolysis of pyrophosphate releases a large amount of free energy and makes the nucleotide incorporation reaction essentially irreversible.

As the polymerase moves along the DNA template, it incorporates complementary nucleotides into the growing RNA chain. A nucleotide is incorporated into the RNA strand if it is able to form a proper (Watson-Crick) base pair with the nucleotide in the DNA strand being transcribed. This can be seen in Figure 11.4c where the incoming adenosine 5′-triphosphate pairs with the thymine-containing nucleotide of the template. Once the polymerase has moved past a particular stretch of DNA, the DNA double helix re-forms (as in Figure 11.4a,b). Consequently, the RNA chain does not remain associated with its template as a DNA–RNA hybrid (except for about nine nucleotides just behind the site where the polymerase is operating). RNA polymerases are capable of incorporating from about 20 to 50 nucleotides into a growing RNA molecule per second, and many genes in a cell are transcribed simultaneously by a hundred or more polymerases (as seen in Figure 11.11c). The frequency at which a gene is transcribed is tightly regulated and can vary dramatically depending on the given gene and the prevailing conditions. The electron micrograph of Figure 11.4d shows a molecule of phage DNA with a number of bound RNA polymerase molecules.

RNA polymerases are capable of forming prodigiously long RNAs. Consequently, the enzyme must remain attached to the DNA over long stretches of template (the enzyme is said to be processive). At the same time, the enzyme must be associated loosely enough that it can move from nucleotide to nucleotide of the template. It is difficult to study certain properties of RNA polymerases, such as processivity, using biochemical methodologies that tend to average out differences between individual protein molecules. Consequently, researchers have developed techniques to follow the activities of single RNA polymerase molecules similar to those used to study individual cytoskeletal motors. Two examples of such studies are depicted in Figure 11.5. In both of these examples, a single RNA polymerase is attached to the surface of a glass coverslip and allowed to transcribe a DNA molecule containing a fluorescent bead covalently linked to one of its ends. The movement of the fluorescent bead can be monitored under a fluorescence microscope. In Figure 11.5a, the bead is free to move in the medium, and its range of movement is proportional to the length of the DNA between the polymerase and the bead. As the polymerase transcribes the template, the connecting DNA strand is elongated, and the movement of the bead is increased. This type of system allows investigators to study the rate of transcription of an individual polymerase and to determine if the polymerase transcribes the DNA in a steady or discontinuous movement.

Figure 11.5 Examples of experimental techniques to follow the activities of single RNA polymerase molecules. (a) In this protocol, the RNA polymerase molecule is attached to the underlying coverslip and allowed to transcribe a DNA molecule containing a fluorescent bead at the upstream end. The arrows indicate the movement of the DNA through the polymerase. The rate of movement and progression of the polymerase can be followed by observing the position of the bead over time using a fluorescence microscope. (b) In this protocol, the attached polymerase is transcribing a DNA molecule with a bead at its downstream end. As in a, the arrows indicate direction of DNA movement. The bead is caught in an optical (laser) trap, which delivers a known force that can be varied by adjusting the laser beam. This type of apparatus can measure the forces generated by a transcribing polymerase. (REPRINTED FROM J. GELLES AND R. LANDICK, CELL 93:15, 1998, COPYRIGHT 1998, WITH PERMISSION FROM ELSEVIER SCIENCE.)

In Figure 11.5b, the bead at the end of the DNA molecule being transcribed is trapped by a focused laser beam (page 143). The minute force exerted by the laser trap can be varied, until it is just sufficient to stop the polymerase from continuing to transcribe the DNA. Measurements carried out on single RNA polymerase molecules in the act of transcription indicate that these enzymes can move over the template, one base (3.4 Å) at a time, with a force several times that of a myosin molecule. Even though polymerases are relatively powerful motors, these enzymes do not necessarily move in a steady, continuous fashion but may pause at certain locations along the template or even backtrack before resuming their forward progress. A number of elongation factors have been identified that enhance the enzyme’s ability to traverse these various roadblocks. In some cases, as occurs following the rare incorporation of an incorrect nucleotide, a stalled polymerase must backtrack and then digest away the 3′ end of the newly synthesized transcript and resynthesize the missing portion before continuing its movement. The ability of RNA polymerase to recognize and remove a misincorporated nucleotide is referred to as “proofreading.” This corrective function is carried out by the same active site within the enzyme that is responsible for nucleotide incorporation. At this point in the discussion, it is useful to examine the differences in the process of transcription between bacterial and eukaryotic cells.
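The elongation rates quoted in this section (about 20 to 50 nucleotides per second) can be turned into a rough estimate of how long one polymerase needs to complete a transcript. The sketch below is simple arithmetic. The function name and the 3,000-nucleotide transcription unit are hypothetical, and the calculation assumes uninterrupted movement, so the pausing and backtracking described above would lengthen the real times.

```python
def transcription_time_seconds(length_nt: int, rate_nt_per_s: float) -> float:
    """Time for one polymerase to transcribe length_nt nucleotides at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return length_nt / rate_nt_per_s


gene_length_nt = 3000  # hypothetical transcription unit, for illustration only

slowest = transcription_time_seconds(gene_length_nt, 20)  # 150.0 seconds
fastest = transcription_time_seconds(gene_length_nt, 50)  # 60.0 seconds
```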

[Figure 11.6 labels: core enzyme; loose association between DNA and core enzyme, in which RNA chains that are begun are not initiated at proper sites; complete enzyme (holoenzyme); association of complete enzyme with DNA at proper site and opening of double helix.]

Transcription in Bacteria

Bacteria, such as E. coli, contain a single type of RNA polymerase composed of five subunits that are tightly associated to form a core enzyme. If the core enzyme is purified from bacterial cells and added to a solution of bacterial DNA molecules and ribonucleoside triphosphates, the enzyme binds to the DNA and synthesizes RNA. The RNA molecules produced by a purified polymerase, however, are not the same as those found within cells because the core enzyme has attached to random sites in the DNA (Figure 11.6a), sites that it would normally have ignored in vivo. If, however, a purified accessory polypeptide called sigma factor (σ) is added to the RNA polymerase before it attaches to DNA, transcription begins at selected locations (Figure 11.6b–d). Attachment of σ factor to the core enzyme increases the enzyme’s affinity for promoter sites in DNA and decreases its affinity for DNA in general. As a result, the complete enzyme is thought to slide freely along the DNA until it recognizes and binds to a suitable promoter region. X-ray crystallographic analysis of the bacterial RNA polymerase (see Figure 11.8) reveals a molecule shaped like a “crab claw” with a pair of mobile pincers (or jaws) enclosing a positively charged internal channel. As the sigma factor interacts with the promoter, the jaws of the enzyme grip the downstream DNA duplex, which resides within the channel (as in Figure 11.4b). The enzyme then separates (or melts) the two DNA strands in the region surrounding the start site (Figure 11.6c). Strand separation makes the template strand accessible to the enzyme’s active site, which resides at the back wall of the channel. Initiation of transcription appears to be a difficult undertaking because an RNA polymerase typically makes several unsuccessful attempts to assemble an RNA transcript. Once

[Figure 11.6 labels, continued: sigma (σ) factor; -35 and -10 regions; initiation start site; loss of sigma factor as RNA chain is elongated.]

Figure 11.6 Schematic representation of the initiation of transcription in bacteria. (a) In the absence of the σ factor, the core enzyme does not interact with the DNA at specific initiation sites. (b–d) When the core enzyme is associated with the σ factor, the complete enzyme (or holoenzyme) is able to recognize and bind to the promoter regions of the DNA, separate the strands of the DNA double helix, and initiate transcription at the proper start sites (see Figure 11.7). In the traditional model shown here, the σ factor dissociates from the core enzyme, which is capable of transcription elongation. Several studies suggest that, in at least some cases, σ may remain with the polymerase.

about ten nucleotides have been successfully incorporated into a growing transcript, the enzyme undergoes a major change in conformation and is transformed into a transcriptional elongation complex that can move processively along the DNA. In the model shown in Figure 11.6d, the formation of an elongation complex is followed by release of the sigma factor.

[Figure 11.7 labels: nontemplate strand; template strand; upstream; -35 sequence; -10 sequence; +1 start point; nascent RNA chain; downstream.]

sequence is reached. In roughly half of the cases, a ring-shaped protein called rho is required for termination of bacterial transcription. Rho encircles the newly synthesized RNA and moves along the strand in a 5′ → 3′ direction to the polymerase, where it separates the RNA transcript from the DNA to which it is bound. In other cases, the polymerase stops transcription when it reaches a terminator sequence. Terminator sequences typically fold into a hairpin loop that causes the RNA polymerase to release the completed RNA chain without requiring additional factors.

Transcription and RNA Processing in Eukaryotic Cells

As discovered in 1969 by Robert Roeder at the University of Washington, eukaryotic cells have three distinct transcribing enzymes in their cell nuclei. Each of these enzymes is responsible for synthesizing a different group of RNAs (Table 11.1). Plants have two additional RNA polymerases that are not essential for life. No prokaryote has been found with multiple RNA polymerases, whereas the simplest eukaryotes (yeast) have the same three nuclear types that are present in mammalian cells. This difference in number of RNA polymerases is another sharp distinction between prokaryotic and eukaryotic cells. Figure 11.8a shows the surface structures of RNA polymerases from each of the three domains of life: archaea, bacteria, and eukaryotes. Several features are apparent from a close examination of this illustration. It is evident that the archaeal and eukaryotic enzymes are more similar in structure to one another than are the bacterial and archaeal enzymes. This feature reflects the evolutionary relationship between archaea

Table 11.1 Eukaryotic Nuclear RNA Polymerases

Enzyme                                RNAs Synthesized
RNA polymerase I                      larger rRNAs (28S, 18S, 5.8S)
RNA polymerase II                     mRNAs, most small nuclear RNAs (snRNAs and snoRNAs), most microRNAs, and telomerase RNA
RNA polymerase III                    other small RNAs, including tRNAs, 5S rRNA, and U6 snRNA
RNA polymerase IV, V (plants only)    siRNAs


As noted above, promoters are the sites in DNA that bind RNA polymerase. Bacterial promoters are located in the region of a DNA strand just preceding the initiation site of RNA synthesis (Figure 11.7). The nucleotide at which transcription is initiated is denoted as +1 and the preceding nucleotide as -1. Those portions of the DNA preceding the initiation site (toward the 3′ end of the template) are said to be upstream from that site. Those portions of the DNA succeeding it (toward the 5′ end of the template) are said to be downstream from that site. Analysis of the DNA sequences just upstream from a large number of bacterial genes reveals that two short stretches of DNA are similar from one gene to another. One of these stretches is centered at approximately 35 bases upstream from the initiation site and typically occurs as the sequence TTGACA (Figure 11.7). This TTGACA sequence (known as the -35 element) is called a consensus sequence, which indicates that it is the most common version of a conserved sequence, but that some variation occurs from one gene to another. The second conserved sequence is found approximately 10 bases upstream from the initiation site and occurs as the consensus sequence TATAAT (Figure 11.7). This site in the promoter, named the Pribnow box after its discoverer, is responsible for identifying the precise nucleotide at which transcription begins. As the sigma factor recognizes the Pribnow box, amino acid residues within the protein interact with each of the six nucleotides of the TATAAT sequence of the nontemplate strand. Two of these nucleotides are flipped out of the nucleotide stack and into the core of the protein. This action likely initiates melting of the adjoining region of promoter DNA and formation of the transcription bubble seen in Figures 11.6 and 11.7.

Bacterial cells possess a variety of different σ factors that recognize different versions of the promoter sequence. The σ70 is known as the “housekeeping” σ factor, because it initiates transcription of most genes. Alternative σ factors initiate transcription of a small number of specific genes that participate in a common response. For example, when E. coli cells are subjected to a sudden rise in temperature, a new σ factor is synthesized that recognizes a different promoter sequence and leads to the coordinated transcription of a battery of heat-shock genes. The products of these genes protect the proteins of the cell from thermal damage (page 80). Just as transcription is initiated at specific points in the chromosome, it also terminates when a specific nucleotide


Figure 11.7 The basic elements of a promoter region in the DNA of the bacterium E. coli. The key regulatory sequences required for initiation of transcription are found in regions located at -35 and -10 base pairs from the site at which transcription is initiated. The initiation site marks the boundary between the + and - sides of the gene.
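The notion of a consensus sequence, a most common version with some variation from gene to gene, lends itself to a small search sketch. The code below scans a nontemplate strand for a -35-like hexamer followed by a -10-like hexamer and tolerates a limited number of mismatches. Several details are assumptions made for illustration rather than statements from the text: the function names, the mismatch threshold, the 16 to 18 base spacer between the two elements, and the example sequence. Real σ factor recognition depends on more than hexamer identity.

```python
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def count_mismatches(window: str, consensus: str) -> int:
    """Number of positions at which a window differs from the consensus."""
    return sum(a != b for a, b in zip(window, consensus))


def find_candidate_promoters(seq, max_mismatches=2, spacer_range=(16, 18)):
    """Return (pos_35, pos_10, mismatches) tuples for candidate promoters.

    seq is the nontemplate strand written 5'->3'; positions are 0-based
    offsets of the first base of each hexamer. The spacer range and the
    mismatch threshold are assumptions, not values taken from the text.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = count_mismatches(seq[i:i + 6], MINUS_35_CONSENSUS)
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break  # longer spacers would run past the end as well
            m10 = count_mismatches(seq[j:j + 6], MINUS_10_CONSENSUS)
            if m35 + m10 <= max_mismatches:
                hits.append((i, j, m35 + m10))
    return hits


# Hypothetical sequence: perfect -35 and -10 elements separated by 17 bases.
example = "GC" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCA"
hits = find_candidate_promoters(example)  # includes (2, 25, 0)
```

Raising `max_mismatches` makes the search more permissive, which mirrors the point that promoters of individual genes deviate from the consensus to different degrees.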


Figure 11.8 A comparison of prokaryotic and eukaryotic RNA polymerase structure. (a) RNA polymerases from the three domains of life. Each subunit of an enzyme is denoted by a different color and labeled according to conventional nomenclature for that enzyme. Homologous subunits are depicted by the same color. It can be seen that the archaeal and eukaryotic polymerases are more similar in structure to one another than are the bacterial and eukaryotic enzymes. RNA polymerase II (shown here) is only one of three major eukaryotic nuclear RNA polymerases. (b) Ribbon diagram of the core structure of yeast RNA polymerase II. Regions of the bacterial polymerase that are structurally homologous to the yeast enzyme are shown in green. The large channel that grips the downstream DNA is evident. A divalent Mg2+ ion, situated at the end of the channel and within the active site, is seen as a red sphere. (A: FROM AKIRA HIRATA, BRIANNA J. KLEIN, AND KATSUHIKO S. MURAKAMI, NATURE 451:852, 2008. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.; B: FROM PATRICK CRAMER, ET AL., SCIENCE 292:1874, 2001, FIG. 12B. © 2001 REPRINTED WITH PERMISSION FROM AAAS. COURTESY ROGER KORNBERG, STANFORD UNIVERSITY SCHOOL OF MEDICINE.)

and eukaryotes that was discussed back on page 29. The subunits that make up each of the proteins depicted in Figure 11.8a are indicated by a different color, and it is immediately evident that RNA polymerases are multisubunit enzymes. Those subunits between the different enzymes that are homologous to one another (i.e., are derived from a common ancestral polypeptide) are shown in the same color. Thus, it is also evident from Figure 11.8a that the RNA polymerases from the

three domains share a number of subunits. Although the yeast enzyme shown in Figure 11.8a has a total of 12 subunits, 7 more than its bacterial counterpart, the fundamental core structure of the two enzymes is virtually identical. This evolutionary conservation in RNA polymerase structure is revealed in Figure 11.8b, which shows the homologous regions of the bacterial and eukaryotic enzymes at much higher resolution. Our understanding of transcription in eukaryotes was greatly advanced with the 2001 publication of the X-ray crystallographic structure of yeast RNA polymerase II by Roger Kornberg and colleagues at Stanford University. As a result of these studies, and those of other laboratories in subsequent years, we now know a great deal about the mechanism of action of RNA polymerases as they move along the DNA, transcribing a complementary strand of RNA. A major distinction between transcription in prokaryotes and eukaryotes is the requirement in eukaryotes for a large variety of accessory proteins, or transcription factors. These proteins play a role in virtually every aspect of the transcription process, from the binding of the polymerase to the DNA template, to the initiation of transcription, to its elongation and termination. Although transcription factors are crucial for the operation of all three types of eukaryotic RNA polymerases, they will only be discussed in regard to the synthesis of mRNAs by RNA polymerase II (page 442). All three major types of eukaryotic RNAs—mRNAs, rRNAs, and tRNAs—are derived from precursor RNA molecules that are considerably longer than the final RNA product. The initial precursor RNA is equivalent in length to the full length of the DNA transcribed and is called the primary transcript, or pre-RNA. The corresponding segment of DNA from which a primary transcript is transcribed is called a transcription unit. 
Primary transcripts do not exist within the cell as naked RNA but become associated with proteins even as they are synthesized. Primary transcripts typically have a fleeting existence, being processed into smaller, functional RNAs by a series of “cut-and-paste” reactions. RNA processing requires a variety of small RNAs (90 to 300 nucleotides long) and their associated proteins. In the following sections, we will examine the activities associated with the transcription and processing of each of the major eukaryotic RNAs.

REVIEW
1. What is the role of a promoter in gene expression? Where are the promoters for bacterial polymerases located?
2. Describe the steps during initiation of transcription in bacteria. What is the role of the σ factor? What is the nature of the reaction in which nucleotides are incorporated into a growing RNA strand? How is the specificity of nucleotide incorporation determined? What is the role of pyrophosphate hydrolysis?
3. How does the number of RNA polymerases distinguish prokaryotes and eukaryotes? What is the relationship between a pre-RNA and a mature RNA?


11.3 | Synthesis and Processing of Eukaryotic Ribosomal and Transfer RNAs

A eukaryotic cell may contain millions of ribosomes, each consisting of several molecules of rRNA together with dozens of ribosomal proteins. The composition of a mammalian ribosome is shown in Figure 11.9. Ribosomes are so numerous that more than 80 percent of the RNA in most cells consists of ribosomal RNA. To furnish the cell with such a large number of transcripts, the DNA sequences encoding rRNA are normally repeated hundreds of times. This DNA, called rDNA, is typically clustered in one or a few regions of the genome. The human genome has five rDNA clusters, each on a different chromosome. In a nondividing (interphase) cell, the clusters of rDNA are gathered together as part of one or more irregularly shaped nuclear structures, called nucleoli (singular, nucleolus), that function in producing ribosomes (Figure 11.10a). The bulk of a nucleolus is composed of nascent ribosomal subunits that give the nucleolus a granular appearance (Figure 11.10b). Embedded within this granular mass are one or more rounded cores consisting primarily of fibrillar material. As

[Figure 11.9 labels: eukaryotic (mammalian) 80S ribosome, 24 nm; 60S subunit with 28S, 5.8S, and 5S rRNAs and 49 ribosomal proteins; 40S subunit with 18S rRNA and 33 ribosomal proteins.]

[Figure 11.10 labels: (a) scale bar, 10 µm; (b) dense fibrillar component (dfc); granular component (gc).]

Figure 11.9 The macromolecular composition of a mammalian ribosome. This schematic drawing shows the components present in each of the subunits of a mammalian ribosome. The synthesis and processing of the rRNAs and the assembly of the ribosomal subunits are discussed in the following pages. (FROM D. P. SNUSTAD ET AL., PRINCIPLES OF GENETICS; COPYRIGHT © 1997, JOHN WILEY & SONS, INC. REPRINTED BY PERMISSION OF JOHN WILEY & SONS, INC.)

Figure 11.10 The nucleolus. (a) Light micrograph of two human HeLa cells transfected with a gene for a ribosomal protein fused to the green fluorescent protein (GFP). The fluorescent ribosomal protein can be seen in the cytoplasm where it is synthesized and ultimately functions, and in the nucleoli (white arrows), where it is assembled into ribosomes. (b) Electron micrograph of a section of part of a nucleus with a nucleolus. Three distinct nucleolar regions can be distinguished morphologically. The bulk of the nucleolus consists of a granular component (gc), which contains ribosomal subunits in various stages of assembly. Embedded within the granular regions are fibrillar centers (fc) that are surrounded by a more dense fibrillar component (dfc). The inset shows a schematic drawing of these parts of the nucleolus. According to one model, the fc contains the DNA that codes for ribosomal RNA, and the dfc contains the nascent pre-rRNA transcripts and associated proteins. According to this model, transcription of the pre-rRNA precursor takes place at the border between the fc and dfc. (Note: Nucleoli have other functions unrelated to ribosome biogenesis that are not discussed in this text.) Bar, 1 µm. (A: FROM C. E. LYON AND A. I. LAMOND, CURR. BIOL. 10:R323, 2000, FIG. B, WITH PERMISSION FROM ELSEVIER; B: FROM PAVEL HOZAK, ET AL., J. CELL SCIENCE 107:646, 1994. REPRODUCED WITH PERMISSION OF THE COMPANY OF BIOLOGISTS LTD. http://jcs.biologists.org/content/107/2/639.full.pdf+html?sid=8e44240f0860-4bea-bef4-b79b866c0261)


discussed in the legend of Figure 11.10, the fibrillar material is thought to consist of rDNA templates and nascent rRNA transcripts. In the following sections, we will examine the process by which these rRNA transcripts are synthesized.

Synthesizing the rRNA Precursor

Oocytes are typically very large cells (e.g., 100 µm in diameter in mammals); those of amphibians are generally enormous (up to 2.5 mm in diameter). During the growth of amphibian oocytes, the amount of rDNA in the cell is greatly increased, as is the number of nucleoli (Figure 11.11a). The selective amplification of rDNA is necessary to provide the large numbers of ribosomes that are required by the fertilized egg to begin embryonic development. Because these oocytes contain hundreds of nucleoli, each actively manufacturing rRNA, they are ideal subjects for the study of rRNA synthesis and processing. Our understanding of rRNA synthesis (and DNA transcription in general) was greatly advanced in the late 1960s with the development of techniques by Oscar Miller, Jr. of the University of Virginia to visualize “genes in action” with the electron microscope. To carry out these studies, the fibrillar cores of oocyte nucleoli were gently dispersed to reveal the presence of a large circular fiber. When one of these fibers was examined in the electron microscope, it was seen to resemble a chain of Christmas trees (Figure 11.11b,c). The electron micrographs of Figure 11.11 and the interpretive drawings of Figure 11.12 reveal a number of aspects of nucleolar activity and rRNA synthesis.

1. The micrograph in Figure 11.11b shows numerous genes for ribosomal RNA situated one after the other along a single DNA molecule, thus revealing the tandem arrangement of the repeated rRNA genes.

2. The micrograph in Figure 11.11b shows a static image of dynamic events that occur in the nucleolus. We can interpret this photograph to provide a great deal of information about the process of rRNA transcription. Each of the 100 or so fibrils emerging from the DNA as a branch of a Christmas tree is a nascent rRNA transcript caught in the

Chapter 11 Gene Expression: From Transcription to Translation


Figure 11.11 The synthesis of ribosomal RNA. (a) Light micrograph of an isolated nucleus from a Xenopus oocyte stained to reveal the hundreds of nucleoli. (b) Electron micrograph of a segment of DNA isolated from one of the nucleoli from a Xenopus oocyte. The DNA (called rDNA) contains the genes that encode the two large ribosomal RNAs, which are carved from a single primary transcript. Numerous genes are shown, each in the process of being transcribed. Transcription is evident by the fibrils that are attached to the DNA. These fibrils consist of nascent RNA and associated protein. The stretches of DNA between the transcribed genes are nontranscribed spacers. The arrows indicate sites where transcription is initiated. (c) A closer view of two nucleolar genes being transcribed. The length of the nascent rRNA primary transcript increases with increasing distance from the point of initiation. The RNA polymerase molecules at the base of each fibril can be seen as dots. (A: FROM DONALD D. BROWN AND IGOR B. DAWID, SCIENCE 160:272, 1968; © 1968, REPRINTED WITH PERMISSION FROM AAAS. B–C: COURTESY OF OSCAR L. MILLER, JR., AND BARBARA R. BEATTY.)


act of elongation. The dark granule at the base of each fibril, visible in the higher magnification photograph of Figure 11.11c, is the RNA polymerase I molecule synthesizing that transcript. The length of the fibrils increases gradually from one end of the Christmas tree trunk to the other. The shorter fibrils are RNA molecules of fewer nucleotides that are attached to polymerase molecules bound to the DNA closer to the transcription initiation site. The longer the fibril, the closer the transcript is to completion. The length of DNA between the shortest and longest RNA fibrils corresponds to a single transcription unit (Figure 11.12). The promoter lies just upstream from the transcription initiation site. The high density of RNA polymerase molecules along each transcription unit (about 1 every 100 base pairs of DNA) reflects the high rate of rRNA synthesis in the nucleoli of these oocytes.

3. It can be seen in Figure 11.11c that the nascent RNA transcripts contain associated particles. These particles consist of RNA and protein that work together to convert the rRNA precursors to their final rRNA products and assemble them into ribosomal subunits. These processing events occur as the RNA molecule is being synthesized.

4. It can be noted from Figure 11.11b that the region of the DNA fiber between adjacent transcription units is devoid of nascent RNA chains. Because this region of the ribosomal gene cluster is not transcribed, it is referred to as the nontranscribed spacer (Figure 11.12). Nontranscribed spacers are present between various types of tandemly repeated genes, including those of tRNAs and histones.

Figure 11.12 The rRNA transcription unit. The top drawing depicts the appearance of a portion of the DNA from a nucleolus as it is transcribed into rRNA. The lower drawings illustrate one of the transcription units that codes for rRNA in Xenopus and the mouse. Those parts of the DNA that encode the mature rRNA products are shown in green. Regions of transcribed spacer, that is, portions of the DNA that are transcribed, but whose corresponding RNAs are degraded during processing, are shown in yellow. The nontranscribed spacer, which lies between the transcription units, contains the promoter region at the 5′ side of the gene. (AFTER B. SOLLNER-WEBB AND E. B. MOUGEY, TRENDS BIOCHEM. SCI. 16:59, 1991.)

Processing the rRNA Precursor

Eukaryotic ribosomes have four distinct ribosomal RNAs, three in the large subunit and one in the small subunit. In humans, the large subunit contains a 28S, 5.8S, and 5S RNA molecule, and the small subunit contains an 18S RNA molecule.¹ Three of these rRNAs (the 28S, 18S, and 5.8S) are carved by various nucleases from a single primary transcript (called the pre-rRNA). The 5S rRNA is synthesized from a separate RNA precursor outside the nucleolus. We will begin this discussion with the pre-rRNA.

¹ The S value (or Svedberg unit) refers to the sedimentation coefficient of the RNA; the larger the number, the more rapidly the molecule moves through a field of force during centrifugation, and (for a group of chemically similar molecules) the larger the size of the molecule. The 28S, 18S, 5.8S, and 5S RNAs consist of about 5000, 2000, 160, and 120 nucleotides, respectively.

Two of the peculiarities of the pre-rRNA, as compared with other RNA transcripts, are the large number of methylated nucleotides and pseudouridine residues. By the time a human pre-rRNA is first cleaved, over 100 methyl groups have been added to ribose groups in the molecule, and approximately 95 of its uridine residues have been enzymatically converted to pseudouridine (see Figure 11.15a). All of these modifications occur after the nucleotides are incorporated into the nascent RNA, that is, posttranscriptionally. The altered nucleotides are located at specific positions and are clustered within portions of the molecule that have been conserved among organisms. All of the nucleotides in the pre-rRNA that are altered remain as part of the final products, while unaltered sections are discarded during processing. The functions of the methyl groups and pseudouridines are unclear. These modified nucleotides may protect parts of the pre-rRNA from enzymatic cleavage, promote folding of rRNAs into their final three-dimensional structures, and/or promote interactions of rRNAs with other molecules. Because rRNAs are so heavily methylated, their synthesis can be followed by incubating cells with radioactively labeled methionine, a compound that serves in most cells as a methyl group donor. The methyl group is transferred enzymatically from methionine to nucleotides in the pre-rRNA. When [14C]methionine is provided to cultured mammalian cells for

a short period of time, a considerable fraction of the incorporated radioactivity is present in a 45S RNA molecule, equal to a length of about 13,000 nucleotides. The 45S RNA is cleaved into smaller molecules that are then trimmed down to the 28S, 18S, and 5.8S rRNA molecules. The combined length of the three mature rRNAs is approximately 7000 nucleotides, or a little more than half of the primary transcript. Some of the steps in the processing pathway from the 45S pre-rRNA to the mature rRNAs can be followed by incubating mammalian cells very briefly with labeled methionine and then chasing the cells in nonlabeled medium for varying lengths of time (Figure 11.13). (This type of “pulse-chase” experiment was discussed on page 273.) As noted above, the first species to become labeled in this type of experiment is the 45S primary transcript, which is seen as a peak of radioactivity (red line) in the nucleolar RNA fraction after 10 minutes. After about an hour, the 45S RNA has disappeared from the nucleolus and is largely replaced by a 32S RNA, which is one of the two major products produced from the 45S primary transcript. The 32S RNA is seen as a distinct peak in the nucleolar fraction from 40 minutes to 150 minutes. The 32S RNA is a precursor to the mature 28S and 5.8S rRNAs. The other major product of the 45S pre-rRNA leaves the nucleolus quite rapidly and appears in the cytoplasm as the mature 18S rRNA (seen in the 40-minute cytoplasmic fraction). After a period of two or more hours, nearly all of the radioactivity has left the nucleolus, and most has accumulated in the 28S and 18S rRNAs of the cytoplasm. The radioactivity in the 4S RNA peak includes the 5.8S rRNA as well as methyl groups that have been transferred to small tRNA molecules. Figure 11.14 shows a likely pathway in the processing of an rRNA primary transcript.

The Role of the snoRNAs

The processing of the pre-rRNA is accomplished with the help of a large number of small, nucleolar RNAs (or snoRNAs) that are packaged with particular proteins to form particles called snoRNPs (small, nucleolar ribonucleoproteins). Electron micrographs indicate that snoRNPs begin to associate with the rRNA precursor before it is fully transcribed. The first RNP particle to attach to a pre-rRNA transcript contains the U3 snoRNA and


Figure 11.13 Kinetic analysis of rRNA synthesis and processing. A culture of mammalian cells was incubated for 10 minutes in [14C]methionine and was then chased in unlabeled medium for various times as indicated in each panel. After the chase, cells were washed free of isotope and homogenized, and nucleolar and cytoplasmic fractions were prepared. The RNA was extracted from each fraction and analyzed by sucrose density-gradient sedimentation. As discussed in Section 18.9, this technique separates RNAs according to size (the larger the size, the closer the RNA is to the bottom of the tube, which corresponds to fraction 1). The continuous blue line represents the UV absorbance of each cellular fraction, which provides a measure of the amount of RNA of each size class. This absorbance profile does not change with time.

The solid red line shows the radioactivity at various times during the chase. The graphs of nucleolar RNA (upper profiles) show the synthesis of the 45S rRNA precursor and its subsequent conversion to a 32S molecule, which is a precursor to the 28S and 5.8S rRNAs. The other major product of the 45S precursor leaves the nucleus very rapidly and, therefore, does not appear prominently in the nucleolar RNA. The lower profiles show the time course of the appearance of the mature rRNA molecules in the cytoplasm. The 18S rRNA appears in the cytoplasm well in advance of the larger 28S species, which correlates with the rapid exodus of the former from the nucleolus. (FROM H. GREENBERG AND S. PENMAN, J. MOL. BIOL. 21:531, 1966; COPYRIGHT 1966, BY PERMISSION OF THE PUBLISHER ACADEMIC PRESS.)
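The length figures quoted in the text for the 45S precursor and its mature products can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the rounded nucleotide lengths given in the text and footnote, so the totals are approximate, and the variable names are chosen here for convenience. It also shows why S values cannot be added the way nucleotide lengths can.

```python
# Approximate lengths (in nucleotides) as quoted in the text and footnote.
PRE_RRNA_NT = 13_000                      # 45S primary transcript
MATURE_NT = {"28S": 5000, "18S": 2000, "5.8S": 160}

# Nucleotide lengths are additive: sum the three mature products.
mature_total = sum(MATURE_NT.values())
fraction_retained = mature_total / PRE_RRNA_NT
discarded_nt = PRE_RRNA_NT - mature_total

# Sedimentation coefficients are NOT additive: the S values of the
# products do not sum to the S value of the precursor.
s_value_sum = 28 + 18 + 5.8

print(f"mature rRNA total: {mature_total} nt")
print(f"fraction of precursor retained: {fraction_retained:.2f}")
print(f"discarded during processing: {discarded_nt} nt")
print(f"sum of product S values: {s_value_sum:.1f} (precursor is 45S)")
```

With these rounded inputs, a little more than half of the precursor survives processing, which agrees with the statement in the text.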


Figure 11.14 A proposed scheme for the processing of mammalian ribosomal RNA. The primary transcript for the rRNAs is a 45S molecule of about 13,000 bases. The principal cleavage events during processing of this pre-rRNA are indicated by the boxed numbers. Cleavage of the primary transcript at sites 1 and 5 removes the 5′ and 3′ external transcribed sequences and produces a 41S intermediate. The second cleavages can occur at either site 2 or 3 depending on the type of cell. Cleavage at site 3 generates the 32S intermediate seen in curves of the previous figure. During the final processing steps, the 28S and 5.8S sections are separated from one another, and the ends of the various intermediates are trimmed down to their final mature size. (AFTER R. P. PERRY, J. CELL BIOL. 91:29S, 1981; BY COPYRIGHT PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

more than two dozen different proteins. This huge component of the rRNA-processing machinery can be seen as a ball bound to the outer end of each nascent RNA fibril in Figure 11.11c, where it catalyzes the removal of the 5′ end of the transcript (Figure 11.14). Some of the other enzymatic cleavages indicated in Figure 11.14 are thought to be catalyzed by the “exosome,” which is an RNA degrading machine that consists of nearly a dozen different exonucleases. U3 and several other snoRNAs were identified years ago because they are present in large quantities (about 10⁶ copies per cell). Subsequently, a different class of snoRNAs present at lower concentration (about 10⁴ copies per cell) was discovered. These low-abundance snoRNAs can be divided into two groups based on their function and similarities in nucleotide sequence. Members of one group (called the box C/D snoRNAs) determine which nucleotides in the pre-rRNA will have their ribose moieties methylated, whereas members of the other group (called the box H/ACA snoRNAs) determine which uridines are converted to pseudouridines. The structures of nucleotides modified by these two reactions are shown in Figure 11.15a.

Figure 11.15 Modifying the pre-rRNA. (a) The most frequent modifications to nucleotides in a pre-rRNA are the conversion (isomerization) of a uridine to a pseudouridine and the methylation of a ribose at the 2′ site of the sugar. To convert uridine to pseudouridine, the N1—C1′ bond is cleaved, and the uracil ring is rotated through 120°, which brings the C5 of the ring into place to form the new bond with C1′ of ribose. These chemical modifications are catalyzed by a protein component of the snoRNPs called dyskerin. (b) The formation of an RNA–RNA duplex between the U20 snoRNA and a portion of the pre-rRNA that leads to 2′ ribose methylation. In each case, the methylated nucleotide in the rRNA is hydrogen bonded to a nucleotide of the snoRNA that is located five base pairs from box D. Box D, which contains the invariable sequence CUGA, is present in all snoRNAs that guide ribose methylation. (c) The formation of an RNA–RNA duplex between U68 snoRNA and a portion of the pre-rRNA that leads to conversion of a uridine to pseudouridine (Ψ). Pseudouridylation occurs at a fixed site relative to a hairpin fold in the snoRNA. The snoRNAs that guide pseudouridylation have a common ACA sequence. (B: AFTER J.-P. BACHELLERIE AND J. CAVAILLÉ, TRENDS IN BIOCH. SCI. 22:258, 1997; COPYRIGHT 1997, WITH PERMISSION FROM ELSEVIER SCIENCE; C: AFTER P. GANOT ET AL., CELL 89:802, 1997.)


Both groups of snoRNAs contain relatively long stretches (10 to 21 nucleotides) that are complementary to sections of the rRNA transcript. These snoRNAs provide an excellent example of the principle that single-stranded nucleic acids having complementary nucleotide sequences are capable of forming double-stranded hybrids. In this case, each snoRNA binds to a specific portion of the pre-rRNA to form an RNA–RNA duplex. The bound snoRNA then guides an enzyme—either a methylase or a pseudouridylase—within the snoRNP to modify a particular nucleotide in the pre-rRNA. Taken together, there are roughly 200 different snoRNAs, one for each site in the pre-rRNA that is ribose-methylated or pseudouridylated. If the gene encoding one of these snoRNAs is deleted, one of the nucleotides of the pre-rRNA fails to be enzymatically modified. The mechanism of action of the two types of snoRNAs is illustrated in Figure 11.15b,c. The nucleolus is the site not only of rRNA processing, but also of assembly of the two ribosomal subunits. More than 200 different proteins have been found to be associated with the rRNAs at different stages of ribosome assembly. These proteins can be divided into two groups: ribosomal proteins that remain in the subunits, and accessory proteins that have a transient interaction with the rRNA intermediates and are required only for processing. Among this latter group are more than a dozen RNA helicases, which are enzymes that unwind regions of double-stranded RNA. These enzymes are presumably involved in the many structural rearrangements that occur during ribosome formation, including the dissociation of snoRNAs from pre-ribosomal particles.
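The guiding role of a box C/D snoRNA can be sketched as a simple string search. The example below is a toy model under stated assumptions, not a description of any real snoRNA: the guide and pre-rRNA sequences are invented, only Watson–Crick pairs are considered (no G·U pairs), and the function name `find_methylation_site` is hypothetical. It applies the rule from the legend of Figure 11.15b, that the methylated rRNA nucleotide pairs with the snoRNA nucleotide located five base pairs from box D.

```python
# Watson-Crick complements for RNA (assumption: no G-U wobble pairs).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence (5'->3') that pairs antiparallel with `rna`."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_methylation_site(guide: str, pre_rrna: str, offset_from_box_d: int = 5):
    """Locate the rRNA nucleotide selected for 2'-O-methylation.

    `guide` is the snoRNA antisense element written 5'->3'; its 3' end is
    assumed to lie immediately upstream of box D. Returns the 0-based
    index in `pre_rrna`, or None if no complementary site exists.
    """
    target = reverse_complement(guide)   # the rRNA stretch the guide binds
    start = pre_rrna.find(target)
    if start == -1:
        return None
    # The guide's 3' end pairs with the 5' end of the rRNA site, so the
    # nth guide nucleotide from box D pairs with the nth site nucleotide.
    return start + offset_from_box_d - 1


# Invented sequences, for illustration only.
guide = "AGUCCGAUGCA"                        # 11-nt antisense element
pre_rrna = "GGAA" + reverse_complement(guide) + "CCAU"

site = find_methylation_site(guide, pre_rrna)
print(site, pre_rrna[site])
```

Deleting the guide in this model (so that `find` returns -1) leaves the site unmodified, which mirrors the observation in the text that loss of a single snoRNA gene leaves one nucleotide of the pre-rRNA unmodified.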


Synthesis and Processing of the 5S rRNA

A 5S rRNA, about 120 nucleotides long, is present as part of the large ribosomal subunit of both prokaryotes and eukaryotes. In eukaryotes, the 5S rRNA molecules are encoded by a large number of identical genes that are separate from the other rRNA genes and are located outside the nucleolus. Following its synthesis, the 5S rRNA is transported to the nucleolus to join the other components involved in the assembly of ribosomal subunits. The 5S rRNA genes are transcribed by RNA polymerase III. RNA polymerase III is unusual among the three polymerases in that it can bind to a promoter site located within the transcribed portion of the target gene.² The internal position of the promoter was first clearly demonstrated by introducing modified 5S rRNA genes into host cells and determining the ability of that DNA to serve as a template for the host polymerase III. It was found that the entire 5′ flanking region could be removed and the polymerase would still transcribe the DNA starting at the normal initiation site. If, however, the deletion included the central part of the gene, the polymerase would not transcribe the DNA or even bind to it. If the internal promoter from a 5S rRNA gene is inserted into another region of the genome, the new site becomes a template for transcription by RNA polymerase III.

² RNA polymerase III transcribes several different RNAs. It binds to an internal promoter when transcribing a pre-5S RNA or pre-tRNA, but binds to an upstream promoter when transcribing the precursors for several others, including U6 snRNA.

Figure 11.16 The arrangement of genes that code for transfer RNAs in Xenopus. A 3.18-kilobase fragment of DNA, showing the arrangement of various tRNA genes and spacers. (FROM S. G. CLARKSON ET AL., IN D. D. BROWN, ED., DEVELOPMENTAL BIOLOGY USING PURIFIED GENES, ACADEMIC PRESS, 1981.)

Transfer RNAs

Plant and animal cells are estimated to have approximately 50 different species of transfer RNA, each encoded by a repeated DNA sequence. The degree of repetition varies with the organism: yeast cells have a total of about 275 tRNA genes, fruit flies about 850, and humans about 1300. Transfer RNAs are synthesized from genes that are found in small clusters scattered around the genome. A single cluster typically contains multiple copies of different tRNA genes, and conversely, the DNA sequence encoding a given tRNA is typically found in more than one cluster. The DNA within a cluster (tDNA) consists largely of nontranscribed spacer sequences with the tRNA coding sequences situated at irregular intervals in a tandemly repeated arrangement (Figure 11.16). Like the 5S rRNA, tRNAs are transcribed by RNA polymerase III and the promoter sequence lies within the coding section of the gene rather than being located at its 5′ flank. The primary transcript of a transfer RNA molecule is larger than the final product, and pieces on both the 5′ and 3′ sides of the precursor tRNA (and a small interior piece in some cases) must be trimmed away. In addition, numerous bases must be modified (see Figure 11.42). One of the enzymes involved in pre-tRNA processing is an endonuclease called ribonuclease P, which is present in both bacterial and eukaryotic cells and consists of both RNA and protein subunits. It is the RNA subunit of ribonuclease P that catalyzes cleavage of the pre-tRNA substrate, a subject discussed in the Experimental Pathways, on page 478.
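The degree of tRNA gene repetition can be made concrete by dividing the gene totals quoted above by the approximate number of tRNA species. This is a rough sketch only: it assumes the figure of about 50 species applies to all three organisms (the text states it for plant and animal cells, so applying it to yeast is an assumption), and it yields an average, not the copy number of any particular tRNA gene.

```python
# Figures quoted in the text; all are approximate.
TRNA_SPECIES = 50
TRNA_GENES = {"yeast": 275, "fruit fly": 850, "human": 1300}


def mean_copies_per_species(total_genes: int, species: int = TRNA_SPECIES) -> float:
    """Average number of gene copies per tRNA species."""
    return total_genes / species


for organism, total in TRNA_GENES.items():
    print(f"{organism}: about {mean_copies_per_species(total):.1f} copies per tRNA species")
```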

REVIEW

1. Describe the differences between a primary transcript, a transcription unit, a transcribed spacer, and a mature rRNA.

2. Draw a representation of an electron micrograph of rDNA being transcribed. Label the nontranscribed spacer, the transcribed spacer, the RNA polymerase molecules, the U3 snoRNP, and the promoter.

3. Compare the organization of the genes that code for the large rRNAs and the tRNAs within the vertebrate genome.


11.4 Synthesis and Processing of Eukaryotic Messenger RNAs

When eukaryotic cells are incubated for a short period in [3H]uridine or [32P]phosphate and immediately killed, most of the radioactivity is incorporated into RNA molecules that have the following properties: (1) they have large molecular weights (up to about 80S, or 50,000 nucleotides); (2) as a group, they are represented by RNAs of diverse (heterogeneous) nucleotide sequence; and (3) they are found only in the nucleus. Because of these properties, these RNAs are referred to as heterogeneous nuclear RNAs (hnRNAs), and they are indicated by the red radioactivity line in Figure 11.17a. When cells that have been incubated in [3H]uridine or [32P]phosphate for a brief pulse are placed into unlabeled medium and chased for several hours before they are killed and the RNA is extracted, the amount of radioactivity in the large nuclear RNAs drops sharply and appears instead in much smaller mRNAs found in the cytoplasm (red line of Figure 11.17b). These early experiments, begun initially by James Darnell, Jr., Klaus Scherrer and colleagues, suggested that the large, rapidly labeled hnRNAs were primarily precursors to the smaller cytoplasmic mRNAs. This interpretation has been fully substantiated by a large body of research over the past 40 years. It is important to note that the blue and red lines in Figure 11.17 follow a very different course. The blue lines, which indicate the optical density (i.e., the absorbance of UV light) of each fraction, provide information about the amount of RNA in each fraction following centrifugation. It is evident from the blue lines that most of the RNA in the cell is present as 18S and 28S rRNA (along with various small RNAs that stay near the top of the tube). The red lines, which indicate the radioactivity in each fraction, provide information about the number of radioactive nucleotides incorporated into different-sized RNAs during the brief pulse. It is evident from these graphs that neither the hnRNA (Figure 11.17a) nor the mRNA (Figure 11.17b) constitutes a significant fraction of the RNA in the cell. If they did, there would be a greater correspondence between the blue and red lines. Thus, even though the mRNAs (and their heterogeneous nuclear RNA precursors) constitute only a small percentage of the total RNA of most eukaryotic cells, they constitute a large percentage of the RNA that is being synthesized by that cell at any given moment (Figure 11.17a).

The reason that there is little or no evidence of the hnRNAs and mRNAs in the optical density plots of Figure 11.17 is that these RNAs are degraded after relatively brief periods of time. This is particularly true of the hnRNAs, which are processed into mRNAs (or completely degraded) even as they are being synthesized. In contrast, the rRNAs and tRNAs have half-lives that are measured in days or weeks and thus gradually accumulate to become the predominant species in the cell. Some accumulation of radioactivity in the mature 28S and 18S rRNAs can be seen after a 3-hour chase (Figure 11.17b). The half-lives of mRNAs vary depending on the particular species, ranging from about 15 minutes to a period of days.

Figure 11.17 The formation of heterogeneous nuclear RNA (hnRNA) and its conversion into smaller mRNAs. (a) Curves showing the sedimentation pattern of total RNA extracted from duck blood cells after exposure to [32P]phosphate for 30 minutes. The larger the RNA, the farther it travels during centrifugation and the closer to the bottom of the tube it ultimately lies. The absorbance (blue line) indicates the total amount of RNA in different regions of the centrifuge tube, whereas the red line indicates the corresponding radioactivity. It is evident that most of the newly synthesized RNA is very large, much larger than the stable 18S and 28S rRNAs. These large RNAs are the hnRNAs. (b) The absorbance and radioactivity profiles of the RNA extracted from cells that were pulse-labeled for 30 minutes as in part a, but then chased for 3 hours in the presence of actinomycin D, which prevents the synthesis of additional RNA. It is evident that the large hnRNAs have been processed into smaller RNA products. (FROM G. ATTARDI ET AL., J. MOL. BIOL. 20:160, 1966; COPYRIGHT 1966, BY PERMISSION OF THE PUBLISHER ACADEMIC PRESS.)

The Machinery for mRNA Transcription

All eukaryotic mRNA precursors are synthesized by RNA polymerase II, an enzyme composed of a dozen different subunits that is remarkably conserved from yeast to mammals. RNA polymerase II binds the promoter with the cooperation of a number of general transcription factors (GTFs) to form a preinitiation complex (PIC). These proteins are referred to as general transcription factors because the same ones are required for accurate initiation of transcription of a diverse array of genes in a wide variety of different organisms. The promoter elements that nucleate PIC assembly lie to the 5′ side of each transcription unit, although the enzyme makes contacts with the DNA on both sides of the transcription start


site (see Figure 11.19b). We will restrict the discussion to the best studied promoters, which are those associated with highly expressed, tissue-specific genes such as the ovalbumin gene, which encodes the white of a chicken egg, and the globin genes, which encode the polypeptides of hemoglobin. A critical portion of the promoter of such genes lies between 24 and 32 bases upstream from the site at which transcription is initiated (Figure 11.18a). This region often contains a consensus sequence that is either identical or very similar to the oligonucleotide 5′-TATAAA-3′ and is known as the TATA box. The first step in assembly of the preinitiation complex is binding of a protein, called TATA-binding protein (TBP), that recognizes the TATA box of these promoters (Figure 11.18b). Thus, as in bacterial cells, a purified eukaryotic polymerase is

Chicken ovalbumin       GAGGCTATATATTCCCCAGGGCTCAGCCAGTGTCTGTACA
Rabbit β-globin         TTGGGCATAAAAGGGAGAGCAGGGCAGCTGCTGCTAACAC
Mouse β-globin major    GAGCATATAAGGTGAGGTAGGATCAGTTGCTCCTCACATT

(a)

not able to recognize a promoter directly and cannot initiate accurate transcription on its own. TBP is present as a subunit of a much larger protein complex called TFIID.³ X-ray crystallography has revealed that binding of TBP to a polymerase II promoter causes a dramatic distortion in conformation of the DNA. As shown in Figure 11.19a, TBP inserts itself into the minor groove of the double helix, bending the DNA molecule more than 80° at the site of DNA–protein interaction. While TBP binds TATA, other subunits of the TFIID complex bind to other regions of the DNA, including elements that lie downstream of the transcription start site. Binding of TFIID sets the stage for the assembly of the complete preinitiation complex, which is thought to occur in a stepwise manner as depicted in Figure 11.18b. Interaction of three of the GTFs (TBP of TFIID, TFIIA, and TFIIB) with DNA is shown in Figure 11.19a. The presence of these three GTFs bound to the promoter provides a platform for the subsequent binding of the huge, multisubunit RNA polymerase with its attached TFIIF (Figure 11.18b). Once the RNA polymerase–TFIIF is in position, another pair of GTFs (TFIIE and TFIIH) join the complex and transform the polymerase into an active, transcribing machine. A three-dimensional model of the preinitiation complex is shown in Figure 11.19b. TFIIH is the only GTF known to possess enzymatic activities. One of the subunits of TFIIH functions as a protein kinase to phosphorylate RNA polymerase (discussed below), whereas two other subunits of this protein function as DNA

³ TBP is actually a universal transcription factor that mediates the binding of all three eukaryotic RNA polymerases. TBP is present as one of the subunits of three different protein complexes. As a subunit of TFIID, TBP promotes the binding of RNA polymerase II. As subunits of the proteins SL1 or TFIIIB, TBP promotes the binding of RNA polymerases I and III, respectively. Polymerase I and III promoters lack a TATA box, as do many polymerase II promoters, yet all of these regions of the DNA bind TBP.


Figure 11.18 Initiation of transcription from a eukaryotic polymerase II promoter. (a) Nucleotide sequence of the region just upstream from the site where transcription is initiated in three different eukaryotic genes. The TATA box is indicated by the blue shading. Many eukaryotic promoters contain a second, conserved core promoter element called the initiator (Inr), which includes the site where transcription is initiated (shown in orange). Other promoter elements are depicted in Figure 12.45. It should be noted that (1) most eukaryotic promoters lack a recognizable TATA box and (2) numerous other promoter elements, for example, the DPE shown in part b, have been identified that lie downstream of the transcription start site. Different genes contain different combinations of promoter elements so that only a subset are required to nucleate PIC assembly. (b) A highly schematic model of the steps in the assembly of the preinitiation complex for RNA polymerase II. The polymerase itself is denoted RNAPII; the other components are the various general transcription factors required in assembly of the complete complex. TFIID includes the TBP subunit, which specifically binds to the TATA box, and a number of other subunits, which collectively are called TBP-associated factors (TAFs). TFIIB is thought to provide a binding site for RNA polymerase. TFIIF, which contains a subunit homologous to the bacterial σ factor, is bound to the entering polymerase. TFIIH contains 10 subunits, 3 of which possess enzymatic activities.
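The statement that the TATA box is "either identical or very similar" to the consensus 5′-TATAAA-3′ can be illustrated by scanning the three promoter sequences of part a for the six-base window that differs least from the consensus. This is a minimal sketch under stated assumptions: the sequences are reconstructed here from the figure and may contain transcription errors, the function name `best_match` is hypothetical, and a simple mismatch count is used in place of the weighted scoring that real motif-finding tools apply.

```python
CONSENSUS = "TATAAA"

# Sequences as reconstructed from Figure 11.18a (assumed accurate).
PROMOTERS = {
    "chicken ovalbumin": "GAGGCTATATATTCCCCAGGGCTCAGCCAGTGTCTGTACA",
    "rabbit beta-globin": "TTGGGCATAAAAGGGAGAGCAGGGCAGCTGCTGCTAACAC",
    "mouse beta-globin major": "GAGCATATAAGGTGAGGTAGGATCAGTTGCTCCTCACATT",
}


def best_match(seq: str, motif: str = CONSENSUS):
    """Return (start_index, window, mismatches) for the closest window.

    Ties are resolved in favor of the leftmost window.
    """
    best = None
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if best is None or mismatches < best[2]:
            best = (i, window, mismatches)
    return best


for gene, seq in PROMOTERS.items():
    start, window, mismatches = best_match(seq)
    print(f"{gene}: {window} at index {start}, {mismatches} mismatch(es)")
```

In each of the three sequences the closest window differs from the consensus at a single position, which is what "very similar" means in practice.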

[Figure 11.19: structural models, parts (a) and (b). Labels in the figure: TBP, TFIIA, TFIIB, TFIIE, TFIIF, and the upstream and downstream DNA.]

Figure 11.19 Structural models of the formation of the preinitiation complex. (a) Model of the complex formed by DNA and three of the GTFs: TBP of TFIID, TFIIA, and TFIIB. Interaction between the TATA box and TBP bends the DNA approximately 80° and allows TFIIB to bind to the DNA both upstream and downstream of the TATA box. (b) Top and bottom view of a model of the preinitiation complex. Unlike the schematic model of Figure 11.18b, DNA (shown in white) is thought to wrap around the preinitiation complex so that GTFs can contact the DNA on both sides of the transcription start site. (A: FROM GOURISANKAR GHOSH AND GREGORY D. VAN DUYNE, STRUCTURE 4:893, 1996, FIG. 2, WITH PERMISSION FROM ELSEVIER. B: FROM M. DOUZIECH ET AL., MOL. CELL BIOL. 20:8175, 2000, FIG. 7A. COURTESY OF BENOIT COULOMBE. REPRODUCED WITH PERMISSION FROM AMERICAN SOCIETY FOR MICROBIOLOGY.)

unwinding enzymes (helicases). DNA helicase activity is required to separate the DNA strands of the promoter, allowing the template strand to find its way into the active site of the polymerase. Once transcription begins, certain of the GTFs (including TFIID) may be left behind at the promoter, while others are released from the complex (Figure 11.20). As long as TFIID remains bound to the promoter, additional RNA polymerase molecules may be able to attach to the promoter site and rapidly initiate additional rounds of transcription.

The carboxyl-terminal domain (CTD) of the largest subunit of RNA polymerase II has an unusual structure; it consists of a sequence of seven amino acids (-Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7-) that is repeated over and over. In humans, the CTD consists of 52 repeats of this heptapeptide. All seven residues of the heptapeptide can be enzymatically modified in one way or another; we will limit the discussion to the serines at positions 2 and 5, which are prime candidates for phosphorylation by protein kinases. The RNA polymerase that assembles into the preinitiation complex is not phosphorylated, whereas the same enzyme engaged in transcription is heavily phosphorylated; all of the added phosphate groups are localized in the CTD (Figure 11.20). CTD phosphorylation can be catalyzed by at least four different protein kinases, including TFIIH, which phosphorylates the serine residues at position 5. Phosphorylation of the polymerase by TFIIH may act as the trigger that uncouples the enzyme from the GTFs and/or the promoter DNA, allowing the enzyme to escape from the preinitiation complex and move down the DNA template.

At the earliest stages of transcription, the CTD is phosphorylated on the Ser5 positions, which provide binding sites for proteins involved in the earliest stages of mRNA processing, such as 5′ cap formation (see Figure 11.34). As the RNA polymerase moves along the gene being transcribed, another kinase (P-TEFb) phosphorylates the CTD on the serine residues at position 2 (as in Figure 11.34). This change in phosphorylation pattern is thought to facilitate the recruitment of additional protein factors involved in RNA splicing and addition of a poly(A) tail, as discussed below. In this way, the CTD acts as a platform for the dynamic gain and loss of factors required for the formation of a mature mRNA. According to some estimates, an elongating RNA polymerase II may contain over 50 components and constitute a total mass of more than 3 million daltons.

[Figure 11.20: schematic of the preinitiation complex at the start site (TBP, TAFs, TFIID, TFIIA, TFIIB, TFIIE, TFIIF, TFIIH, and RNAPII with its CTD) and of the elongating RNAPII, showing phosphate groups (P) on the numbered residues of the CTD heptad repeats, the nascent RNA, and the factors TFIIF, TFIIS, ELL, and P-TEFb.]

Figure 11.20 Initiation of transcription by RNA polymerase II is associated with phosphorylation of the C-terminal domain (CTD). The initiation of transcription is associated with TFIIH-catalyzed phosphorylation of serine residues at the 5 position of each heptad repeat of the CTD. Phosphorylation is thought to provide the trigger for the separation of the transcriptional machinery from the general transcription factors and/or the promoter DNA. TFIIS, ELL, and P-TEFb are three of a number of elongation factors that may become associated with the polymerase as it moves along the DNA. TFIIS helps the polymerase get moving again after it pauses, whereas P-TEFb is a kinase that phosphorylates the Ser2 residues of the CTD after elongation begins. Phosphorylation of Ser2 residues is thought to promote the recruitment of RNA splicing and polyadenylation factors, whose activities are discussed in the following sections (see Figure 11.34).

11.4 Synthesis and Processing of Eukaryotic Messenger RNAs

Termination of transcription by RNA polymerase II is not well understood. There is no evidence that the DNA of protein-coding genes contains well-defined termination sequences as in bacterial cells. In fact, a transcribing RNA polymerase II can travel a variable and extensive distance past the point that will ultimately give rise to the 3′ terminus of the processed mRNA. Thus, unlike in bacteria, where the 3′ end of the mRNA is generated simply by transcription termination, formation of the 3′ end of a eukaryotic mRNA is determined by a separate series of processing steps (page 448).

Together, RNA polymerase II and its GTFs are sufficient to promote a low, basal level of transcription from most promoters under in vitro conditions. As will be discussed at length in Chapter 12, a variety of specific transcription factors are able to bind at numerous sites in the regulatory regions of the DNA. These specific transcription factors can determine (1) whether or not a preinitiation complex assembles at a particular promoter and/or (2) the rate at which the polymerase initiates new rounds of transcription from that promoter.

Before discussing the pathway by which mRNAs are produced, let us first describe the structure of mRNAs so that the reasons for some of the processing steps will be clear.

The Structure of mRNAs

Messenger RNAs share certain properties:

1. They contain a continuous sequence of nucleotides encoding a specific polypeptide.
2. They are found in the cytoplasm.
3. They are attached to ribosomes when they are translated.
4. Most mRNAs contain a significant noncoding segment, that is, a portion that does not direct the assembly of amino acids. For example, approximately 25 percent of each globin mRNA consists of noncoding, nontranslated regions (Figure 11.21). Noncoding portions are found at both the 5′ and 3′ ends of a messenger RNA and contain sequences that have important regulatory roles (Section 12.6).
5. Eukaryotic mRNAs have special modifications at their 5′ and 3′ termini that are found neither on bacterial mRNAs nor on tRNAs or rRNAs. The 3′ end of nearly all eukaryotic mRNAs has a string of 50 to 250 adenosine residues that form a poly(A) tail, whereas the 5′ end has a methylated guanosine cap (Figure 11.21).
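As a rough check on the proportions in item 4, the segment lengths labeled in Figure 11.21 for the human β-globin mRNA can be tallied. The sketch below is illustrative only: the dictionary and function names are assumptions made for this example, and the poly(A) tail is left out of the total because its length varies.

```python
# Segment lengths in nucleotides, as labeled in Figure 11.21
# (the variable poly(A) tail is deliberately excluded).
GLOBIN_MRNA = {"5' UTR": 50, "coding": 438, "3' UTR": 132}

def codon_count(coding_nt):
    """Number of three-nucleotide codons in the coding region."""
    return coding_nt // 3

def utr_fraction(segments):
    """Fraction of the tail-less mRNA made up of untranslated regions."""
    utr = segments["5' UTR"] + segments["3' UTR"]
    return utr / (utr + segments["coding"])

print(codon_count(GLOBIN_MRNA["coding"]))    # 146
print(round(utr_fraction(GLOBIN_MRNA), 3))   # 0.294
```

The 438-nucleotide coding region corresponds to the 146 amino acids of the polypeptide, and the two untranslated regions come to a little under 30 percent of this particular message, which is in the same range as the approximate figure quoted for globin mRNAs.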

We will return shortly to describe how mRNAs are provided with their specialized 5′ and 3′ termini. First, however, it is necessary to take a short detour to understand how mRNAs are formed in the cell.

[Figure 11.21: diagram of the human β-globin mRNA. Segment lengths in nucleotides: 5′ untranslated region (5′ UTR), 50; coding region, 438 (codes for a polypeptide 146 amino acids long); 3′ untranslated region (3′ UTR), 132; poly(A) tail, ~250. Inset: chemical structure of the 5′ cap, in which 7-methylguanosine is joined through a triphosphate bridge to a 2′-O-methylribonucleoside.]

Figure 11.21 Structure of the human β-globin mRNA. The mRNA contains a 5′ methylguanosine cap, 5′ and 3′ noncoding regions that flank the coding segment, and a 3′ poly(A) tail. The lengths of each segment are given in numbers of nucleotides. The length of the poly(A) tail is variable. It typically begins at a length of about 250 nucleotides and is gradually reduced in length, as shown in Figure 12.62. The structure of the 5′ cap is shown.

Split Genes: An Unexpected Finding

Almost as soon as hnRNAs were discovered, it was proposed that this group of rapidly labeled nuclear RNAs were precursors to cytoplasmic mRNAs (page 441). The major sticking point was the difference in size between the two RNA populations: the hnRNAs were several times the size of the mRNAs (Figure 11.22). Why would cells synthesize huge molecules that were precursors to much smaller versions? Early studies on the processing of ribosomal RNA had shown that mature RNAs were carved from larger precursors. Recall that large segments were removed from both the 5′ and 3′ sides of various rRNA intermediates (Figure 11.14) to yield the final, mature rRNA products. It was thought that a similar pathway might account for the processing of hnRNAs to mRNAs. But mRNAs constitute such a diverse population that it was impossible to follow the steps in the processing of a single mRNA species as had been attempted for the rRNAs.

[Figure 11.22c: plot of the distribution of RNA mass (%) against molecular size (kb) for hnRNA and mRNA (mRNA curve × 0.5), with sedimentation markers at 70S, 45S, 28S, and 18S.]

Figure 11.22 The difference in size between hnRNAs and mRNAs. (a,b) Electron micrographs of metal-shadowed preparations of poly(A)-mRNA (a) and poly(A)-hnRNA (b) molecules. Representative size classes of each type are shown. Reference molecule is φX-174 viral DNA. (c) Size distribution of hnRNA and mRNA from mouse L cells as determined by density-gradient sedimentation. The red line represents rapidly labeled hnRNA, whereas the purple line represents mRNA that was isolated from polyribosomes after a 4-hour labeling period. The abscissa has been converted from fraction number (indicated by the points) to molecular size by calibration of the gradients. (FROM JOHN A. BANTLE AND WILLIAM E. HAHN, CELL 8:145, 1976. REPRODUCED WITH PERMISSION FROM ELSEVIER.)

The problem was solved by an unexpected discovery. Until 1977, molecular biologists assumed that a continuous linear sequence of nucleotides in a messenger RNA is complementary to a continuous sequence of nucleotides in one strand of the DNA of a gene. Then, in that year, a remarkable new finding was made by Phillip Sharp, Susan Berget, and their colleagues at MIT and Richard Roberts, Louise Chow, and their colleagues at the Cold Spring Harbor Laboratories in New York. These groups found that the mRNAs they were studying were transcribed from segments of DNA that were separated from one another along the template strand.

The first observations of major importance were made during the analysis of transcription of the adenovirus genome. Adenovirus is a pathogen capable of infecting a variety of mammalian cells. It was found that a number of different adenovirus mRNAs had the same 150- to 200-nucleotide 5′ terminus. One might expect that this leader sequence represents a repeated stretch of nucleotides located near the promoter region of each of the genes for these mRNAs. However, further analysis revealed that the 5′ leader sequence is not complementary to a repeated sequence and, moreover, is not even complementary to a continuous stretch of nucleotides in the template DNA. Instead, the leader is transcribed from three distinct and separate segments of DNA (represented by blocks x, y, and z in the top line of Figure 11.23). The regions of DNA between these blocks, referred to as intervening sequences (I1 to I3 in Figure 11.23), are somehow missing in the corresponding mRNA. It could have been argued that the presence of intervening sequences is a peculiarity of viral genomes, but the basic observation was soon extended to cellular genes themselves.

[Figure 11.23: from top to bottom, the DNA (blocks x, y, and z separated by I1, I2, and I3, followed by the hexon gene), the primary transcript (5′ ppp to 3′), a processing intermediate (7mGppp cap and poly A added; I1, I2, and I3 removed), and the mature mRNA (7mGppp, x, y, z, hexon, poly A).]

Figure 11.23 The discovery of intervening sequences (introns). A portion of the adenovirus genome is shown at the top. The noncontinuous sequence blocks labeled x, y, and z appear in a continuous arrangement in the mature mRNAs that code for a variety of polypeptides, such as the hexon protein. As discussed later in the text, the conversion of the primary transcript to the mRNA involves the removal (excision) of the intervening sequences, or introns (I1 to I3), and ligation of the remaining portions to produce a continuous RNA molecule (bottom). The steps by which this occurs are shown in Figure 11.32.

The presence of intervening sequences in nonviral, cellular genes was first reported in 1977 by Alec Jeffreys and Richard Flavell in The Netherlands and Pierre Chambon in France. Jeffreys and Flavell discovered an intervening sequence of approximately 600 bases located directly within a part of the globin gene that coded for the amino acid sequence of the globin polypeptide (Figure 11.24). The basis for this finding is discussed in the legend accompanying the figure. Intervening sequences were soon found in other genes, and it became apparent that the presence of genes with intervening sequences, called split genes, is the rule, not the exception. Those parts of a split gene that contribute to the mature RNA product are called exons, whereas the intervening sequences are called introns. Split genes are widespread among eukaryotes, although the introns of simpler eukaryotes (e.g., yeast and nematodes) tend to be fewer in number and smaller in size than those of more complex plants and animals. Introns are found in all types of genes, including those that encode tRNAs and rRNAs as well as mRNAs.

[Figure 11.24: restriction maps of rabbit genomic DNA (upper, with the intron marked) and of the globin cDNA (lower). Cleavage sites are labeled Bg, H, E, P, and B; the scale runs from −1 to 1 kb.]

Figure 11.24 The discovery of introns in a eukaryotic gene. As discussed in Chapter 18, bacteria contain restriction enzymes that recognize and cleave DNA molecules at the site of certain nucleotide sequences. The drawing shows a map of restriction enzyme cleavage sites in the region of the rabbit β-globin gene (upper) and the corresponding map of a cDNA prepared from the β-globin mRNA (lower). (A cDNA is a DNA made in vitro by reverse transcriptase using the mRNA as a template. Thus, the cDNA has the complementary sequence to the mRNA. cDNAs had to be used for this experiment because restriction enzymes do not cleave RNAs.) The letters indicate the sites at which various restriction enzymes cleave the two DNAs. The upper map shows that the globin gene contains a restriction site for the enzyme BamHI (B) located approximately 700 base pairs from a restriction site for the enzyme EcoRI (E). When the globin cDNA was treated with these same enzymes (lower map), the corresponding B and E sites were located only 67 nucleotides apart. It is evident that the DNA prepared from the genome has a sizeable region that is absent from the corresponding cDNA (and thus absent from the mRNA from which the cDNA was produced). Complete sequencing of the globin gene later showed it to contain a second smaller intron. (FROM A. J. JEFFREYS AND R. A. FLAVELL, CELL 12:1103, 1977; BY PERMISSION FROM CELL PRESS.)

The discovery of genes with introns immediately raised the question as to how such genes were able to produce messenger RNAs lacking these sequences. One likely possibility was that cells produce a primary transcript that corresponds to the entire transcription unit, and that those portions of the RNA corresponding to the introns in the DNA are somehow removed. If this were the case, then the segments corresponding to the introns should be present in the primary transcript. Such an explanation would also provide a reason why hnRNA molecules are so much larger than the mRNA molecules they ultimately produce.

Research on nuclear RNA had proceeded by this time to the point where the size of a few mRNA precursors (pre-mRNAs) had been determined. The globin sequence, for example, was found to be present within a nuclear RNA molecule that sediments at 15S, unlike the final globin mRNA, which has a sedimentation coefficient of 10S. An ingenious technique (known as R-loop formation) was employed by Shirley Tilghman, Philip Leder, and their co-workers at the National Institutes of Health to determine the physical relationship between the 15S and 10S globin RNAs and provide information on the transcription of split genes.

Recall from page 400 that single-stranded, complementary DNA strands can bind specifically to one another. Single-stranded DNA and RNA molecules can also bind to one another as long as their nucleotide sequences are complementary; this is the basis of the technique of DNA–RNA hybridization discussed in Section 18.10 (the DNA–RNA complex is called a hybrid). Tilghman and her co-workers used the electron microscope to examine a fragment of DNA containing the globin gene that had been hybridized to the 15S globin RNA.
The hybrid was seen to consist of a continuous, double-stranded, DNA–RNA complex (the red dotted and blue lines of the inset to Figure 11.25a). In contrast, when the same DNA fragment was incubated with the mature 10S globin mRNA, a large segment of the DNA in the center of the coding region was seen to bulge out to form a double-stranded loop (Figure 11.25b). The loop resulted from a large intron in the DNA that was not complementary to any part of the smaller globin message. It was apparent that the 15S RNA does indeed contain segments corresponding to the introns of the genes that are removed during formation of the 10S mRNA.

[Figure 11.25: electron micrographs with interpretive insets. Labels: DNA–RNA hybrid and displaced single-stranded DNA (parts a and b); double-stranded DNA of intron (part b).]

Figure 11.25 Visualizing an intron in the globin gene. Electron micrographs of hybrids formed between (a) 15S globin precursor RNA and the DNA of a globin gene and (b) 10S globin mRNA and the same DNA as in a. The red dotted lines in the insets indicate the positions of the RNA molecules. The precursor mRNA is equivalent in length and sequence to the DNA of the globin gene, but the 10S mRNA is missing a portion that is present in the DNA of the gene. These results suggest that the 15S RNA is processed by removing an internal RNA sequence and rejoining the flanking regions. (FROM SHIRLEY M. TILGHMAN ET AL., PROC. NAT’L. ACAD. SCI. U.S.A. 75:1312, 1978.)

At about the same time, a similar type of hybridization experiment was performed between the DNA that codes for ovalbumin, a protein found in hen’s eggs, and its corresponding mRNA. The ovalbumin DNA and mRNA hybrid contains seven distinct loops corresponding to seven introns (Figure 11.26). Taken together, the introns account for about three times as much DNA as that present in the eight combined coding portions (exons). Subsequent studies have revealed that individual exons average about 165 nucleotides. In contrast, individual introns average more than 3500 nucleotides, which is why hnRNA molecules are much longer than mRNAs. To cite two extreme cases, the human dystrophin gene extends for roughly 100 times the length needed to code for its corresponding message, and the type I collagen gene contains over 50 introns. The average human gene contains about 9 introns making up more than 95 percent of the transcription unit.

Figure 11.26 Visualizing introns in the ovalbumin gene. Electron micrograph of a hybrid formed between ovalbumin mRNA and a fragment of genomic chicken DNA containing the ovalbumin gene. The hybrid shown in this micrograph is similar in nature to that of Figure 11.25b. In both cases, the DNA contains the entire gene sequence because it was isolated directly from the genome. In contrast, the RNA has been completely processed, and those portions that were transcribed from the introns have been removed. When the genomic DNA and the mRNA are hybridized, those portions of the DNA that are not represented in the mRNA are thrown into loops. The loops of the seven introns (A–G) can be distinguished. (COURTESY OF PIERRE CHAMBON.)

These and other findings provided strong evidence for the proposition that mRNA formation in eukaryotic cells occurs by the removal of internal sequences of ribonucleotides from a much larger pre-mRNA. Let us turn to the steps by which this occurs.
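Two of the numerical statements in this section can be cross-checked with simple arithmetic. The sketch below is an illustration rather than part of the original analyses; the function names are hypothetical, and the inputs are the approximate values quoted in the text and in the legend to Figure 11.24.

```python
def intron_length_from_maps(genomic_bp, cdna_bp):
    """Extra DNA between two restriction sites in the gene relative to the cDNA."""
    return genomic_bp - cdna_bp

def intron_fraction(n_introns, mean_intron_nt, mean_exon_nt):
    """Fraction of a transcription unit made up of introns.

    Assumes n_introns introns separate n_introns + 1 exons.
    """
    intron_nt = n_introns * mean_intron_nt
    exon_nt = (n_introns + 1) * mean_exon_nt
    return intron_nt / (intron_nt + exon_nt)

# BamHI-EcoRI spacing: about 700 bp in the gene, 67 nucleotides in the cDNA.
print(intron_length_from_maps(700, 67))          # 633

# About 9 introns of 3500 nucleotides and 10 exons of 165 nucleotides.
print(round(intron_fraction(9, 3500, 165), 3))   # 0.95
```

The first result is consistent with an intervening sequence of roughly 600 bases in the globin gene, given that the 700-base-pair spacing is itself approximate. The second shows that the average exon and intron sizes quoted above do put introns at about 95 percent of a typical transcription unit.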

The Processing of Eukaryotic Messenger RNAs

RNA polymerase II assembles a primary transcript that is complementary to the DNA of the entire transcription unit. Electron microscopic examination of transcriptionally active genes indicates that RNA transcripts become associated with proteins and larger particles while they are still in the process of being synthesized (Figure 11.27). These particles, which consist of proteins and ribonucleoproteins, include the agents responsible for converting the primary transcript into a mature messenger. This conversion process requires addition of a 5′ cap and 3′ poly(A) tail to the ends of the transcript, and removal of any intervening introns. Once processing is completed, the mRNP, which consists of mRNA and associated proteins, is ready for export from the nucleus.

Figure 11.27 Pre-mRNA transcripts are processed as they are synthesized (i.e., cotranscriptionally). (a) Electron micrograph of a nonribosomal transcription unit showing the presence of ribonucleoprotein particles attached to the nascent RNA transcripts. (b) Interpretive tracing of the micrograph shown in part a. The dotted line represents the chromatin (DNA) strand, the solid lines represent ribonucleoprotein (RNP) fibrils, and solid circles represent RNP particles associated with the fibrils. Individual transcripts are numbered, beginning with 1, which is closest to the point of initiation. The RNP particles are not distributed randomly along the nascent transcript, but rather are bound at specific sites where RNA processing is taking place. (FROM ANN L. BEYER, OSCAR L. MILLER, JR., AND STEVEN L. MCKNIGHT, CELL 20:78, 1980, REPRODUCED WITH PERMISSION FROM ELSEVIER.)

5′ Caps and 3′ Poly(A) Tails

The 5′ ends of all RNAs initially possess a triphosphate derived from the first nucleoside triphosphate incorporated at the site of initiation of RNA synthesis. Once the 5′ end of an mRNA precursor has been synthesized, several enzyme activities act on this end of the molecule (Figure 11.28). In the first step, the last of the three phosphates is removed, converting the 5′ terminus to a diphosphate (step 1, Figure 11.28). Then, a GMP is added in an inverted orientation so that the 5′ end of the guanosine is facing the 5′ end of the RNA chain (step 2, Figure 11.28). As a result, the first two nucleosides are joined by an unusual 5′–5′ triphosphate bridge. Finally, the terminal, inverted guanosine is methylated at the 7 position on its guanine base, while the nucleotide on the internal side of the triphosphate bridge is methylated at the 2′ position of the ribose (step 3, Figure 11.28). The 5′ end of the RNA now contains a methylguanosine cap (shown in greater detail in Figure 11.21). These enzymatic modifications at the 5′ end of the primary transcript occur very quickly, while the RNA molecule is still in its very early stages of synthesis. In fact, the capping enzymes are recruited by the phosphorylated CTD of the polymerase (see Figure 11.34). The methylguanosine cap at the 5′ end of an mRNA serves several functions: it prevents the 5′ end of the mRNA from being digested by exonucleases, it aids in transport of the mRNA out of the nucleus, and it plays an important role in the initiation of mRNA translation.

As noted above, the 3′ end of an mRNA contains a string of adenosine residues that forms a poly(A) tail. As a number of mRNAs were sequenced, it became evident that the poly(A) tail typically begins 10 to 30 nucleotides downstream from the sequence AAUAAA. This sequence in the primary transcript serves as a recognition site for the assembly of a large complex of proteins that carry out the processing reactions at the 3′ end of the mRNA (Figure 11.28). The poly(A) processing complex is also physically associated with the phosphorylated CTD of RNA polymerase II as it synthesizes the primary transcript (see Figure 11.34). Included among the proteins of the processing complex is an endonuclease (Figure 11.28, top) that cleaves the pre-mRNA downstream from the recognition site. Following cleavage by the nuclease, an enzyme called poly(A) polymerase adds 250 or so adenosines without the need of a template (steps a–c, Figure 11.28). As discussed in Section 12.6, the poly(A) tail together with an associated protein protects the mRNA from premature degradation by exonucleases. It is important to note that many genes possess more than one poly(A) recognition sequence in their 3′ noncoding segments. These genes can be transcribed into mRNAs whose 3′ noncoding segments (3′ UTRs) have different lengths and thus can be subject to different regulatory influences.
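The positional rule described above (a poly(A) tail beginning 10 to 30 nucleotides downstream of AAUAAA) can be expressed as a short scan. This is a minimal sketch, not a model of the processing complex: the function name, the toy sequence, and the choice to measure the 10 to 30 nucleotide gap from the last base of the hexamer are all assumptions made for illustration.

```python
SIGNAL = "AAUAAA"

def candidate_poly_a_windows(rna, min_gap=10, max_gap=30):
    """Return (signal_start, window_start, window_end) for each AAUAAA found.

    The window is the stretch in which cleavage and polyadenylation would be
    expected under the 10-30 nucleotide rule (0-based, end exclusive).
    Assumption: the gap is counted from the end of the hexamer.
    """
    rna = rna.upper().replace("T", "U")   # accept DNA-style input as well
    windows = []
    start = rna.find(SIGNAL)
    while start != -1:
        signal_end = start + len(SIGNAL)
        lo = signal_end + min_gap
        hi = min(signal_end + max_gap, len(rna))
        if lo <= hi:
            windows.append((start, lo, hi))
        start = rna.find(SIGNAL, start + 1)
    return windows

# Hypothetical 3' noncoding segment carrying two recognition sequences.
toy_utr = "GCCUGC" + SIGNAL + "C" * 40 + SIGNAL + "C" * 40
print(candidate_poly_a_windows(toy_utr))   # [(6, 22, 42), (52, 68, 88)]
```

A transcript with two recognition sequences yields two candidate windows, which matches the point that alternative poly(A) sites can give rise to mRNAs with 3′ UTRs of different lengths.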

[Figure 11.28: at the 5′ end of the primary transcript, RNA triphosphatase releases Pi (step 1), guanylyltransferase adds GMP from GTP with release of PPi (step 2), and RNA methyltransferases add methyl groups (step 3) to complete the methylguanosine cap (mG). At the 3′ end, a processing complex that includes an endonuclease and poly(A) polymerase cleaves the transcript and adds AMP residues from ATP with release of PPi (steps a–c) to form the poly(A) tail.]

Figure 11.28 Steps in the addition of a 5′ methylguanosine cap and a 3′ poly(A) tail to a pre-mRNA. The 5′ end of the nascent pre-mRNA binds to a capping enzyme, which in mammals has two active sites that catalyze different reactions: a triphosphatase that removes the terminal phosphate group (step 1) and a guanylyltransferase that adds a guanine residue in a reverse orientation, by means of a 5′-to-5′ linkage (step 2). In step 3, different methyltransferases add a methyl group to the terminal guanosine cap and to the ribose of the nucleotide that had been at the end of the nascent RNA. A protein complex (called CBC) binds to the completed cap (not shown). A very different series of events occurs at the 3′ end of the pre-mRNA, where a large protein complex is assembled. First, an endonuclease cleaves the primary RNA transcript, generating a new 3′ end upstream from the original 3′ terminus. In steps a–c, poly(A) polymerase adds adenosine residues to the 3′ end without the involvement of a DNA template. A typical mammalian mRNA contains 200 to 250 adenosine residues in its completed poly(A) tail; the number is considerably less in lower eukaryotes. (AFTER D. A. MICKLOS AND G. A. FREYER, DNA SCIENCE, CAROLINA BIOLOGICAL SUPPLY CO.)

RNA Splicing: Removal of Introns from a Pre-RNA

The key steps in the processing of a pre-mRNA are shown in Figure 11.29. In addition to formation of the 5′ cap and poly(A) tail, which have already been discussed, those parts of a primary transcript that correspond to the intervening DNA sequences (the introns) must be removed by a complex process known as RNA splicing. To splice an RNA, breaks in the strand must be introduced at the 5′ and 3′ ends (the splice sites) of each intron, and the exons situated on either side of the splice sites must be covalently joined (ligated). It is imperative that the splicing process occur with absolute precision, because the addition or loss of a single nucleotide at any of the splice junctions would cause the resulting mRNA to be mistranslated.

[Figure 11.29: steps shown, in order: globin gene (DNA); transcription by RNA polymerase II and capping; primary transcript with methylguanosine cap (mG); removal of 3′ end by nuclease and polyadenylation; 15S processing intermediate (mG, AAA); endonucleolytic cleavage at splice junctions; ligation of exons; 10S mature mRNA (mG, AAA).]

Figure 11.29 Overview of the steps during the processing of the globin mRNA. Introns are shown in brown, whereas green portions of the gene indicate the positions of the exons, that is, the DNA sequences that are represented in the mature messenger RNA.

How does the same basic splicing machinery recognize the exon–intron boundaries in thousands of different pre-mRNAs? Examination of hundreds of junctions between exons and introns in eukaryotes ranging from yeast to insects to mammals revealed the presence in splice sites of a conserved nucleotide sequence of ancient evolutionary origin. The sequence most commonly found at the exon–intron borders within mammalian pre-mRNA molecules is shown in Figure 11.30. The G/GU at the 5′ end of the intron (the 5′ splice site), the AG/G at the 3′ end of the intron (the 3′ splice site), and the polypyrimidine tract near the 3′ splice site are present in the vast majority of eukaryotic pre-mRNAs.⁴ In addition, the adjacent regions of the intron contain preferred nucleotides, as indicated in Figure 11.30, which play an important role in splice site recognition.

The sequences depicted in Figure 11.30 are necessary for splice-site recognition, but they are not sufficient. Introns typically run for thousands of nucleotides and often contain internal segments that match the consensus sequence shown in Figure 11.30, but the cell does not recognize them as splicing signals and consequently ignores them. The additional clues that allow the splicing machinery to distinguish between exons and introns are provided by specific sequences, most notably the exonic splicing enhancers, or ESEs, situated within exons (see Figure 11.32, inset A). Changes in DNA sequence within either a splice site or an ESE can lead to the inclusion of an intron or the exclusion of an exon. It is estimated that approximately 15 percent of inherited human disease results directly from mutations that alter pre-mRNA splicing. In addition, much of the “normal” genetic variation in susceptibility to common diseases that is present in the human population (page 417) may result from the effects of this variation on RNA splicing efficiency.

Understanding the mechanism of RNA splicing has come about through an appreciation of the remarkable capabilities of RNA molecules.

[Figure 11.30: consensus sequences, read 5′ to 3′ along the pre-mRNA. 5′ splice site: (A/C)AG|GU(A/G)AGU, where | marks the exon–intron boundary. Branch point sequence: YUNAY. Polypyrimidine tract and 3′ splice site: YYYYYYYYYYN(C/U)AG|G.]

Figure 11.30 Nucleotide sequences at the splice sites of pre-mRNAs. In addition to encoding the information to construct a polypeptide, a pre-mRNA must also contain information that directs the machinery responsible for RNA splicing. The nucleotide sequences shown in the regions of the splice sites are based on analysis of a large number of pre-mRNAs and are therefore referred to as consensus sequences. The bases shown in red are virtually invariant; those in black represent the preferred base at that position. N represents any of the four nucleotides; Y represents a pyrimidine. The polypyrimidine tract near the 3′ splice site typically contains between 10 and 20 pyrimidines. The branch point sequence shown is that found in human pre-mRNAs and is typically about 30 bases upstream of the 3′ end of the intron.
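The point that the consensus sequences are necessary but not sufficient can be made concrete by counting how often the minimal motifs occur. The following is an illustrative sketch only: the toy pre-mRNA, the function names, and the reduction of the two splice sites to the trinucleotides G/GU and AG/G are assumptions for this example. Real splice-site recognition also depends on the flanking consensus positions, the branch point, the polypyrimidine tract, and the ESEs.

```python
import re

def junctions_5prime(rna):
    """Positions where an intron could begin: the GU of every G/GU."""
    return [m.start() + 1 for m in re.finditer(r"(?=GGU)", rna)]

def junctions_3prime(rna):
    """Positions where an intron could end: just after the AG of every AG/G."""
    return [m.start() + 2 for m in re.finditer(r"(?=AGG)", rna)]

def expected_random_hits(motif_len, seq_len):
    """Expected count of one fixed motif in a random sequence
    with equal base frequencies (a simplifying assumption)."""
    return (seq_len - motif_len + 1) / 4 ** motif_len

# Hypothetical pre-mRNA: one intron with a decoy G/GU inside it.
exon1 = "AUGGCCAAG"
intron = "GUAAGU" + "CCGGUCA" + "CUAAC" + "UUUUUCCCUUUCC" + "AG"
exon2 = "GCUGAA"
pre_mrna = exon1 + intron + exon2

print(junctions_5prime(pre_mrna))      # [9, 18]
print(junctions_3prime(pre_mrna))      # [9, 42]
print(expected_random_hits(3, 3500))   # 54.65625
```

In this toy transcript the true intron runs from position 9 to position 42, yet the scan also reports a decoy 5′ site at 18 and a decoy 3′ site at 9. By the same reasoning, an intron of average length (about 3500 nucleotides) would be expected to contain dozens of chance matches to G/GU alone, which is why additional signals are needed to mark the real boundaries.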
The first evidence that RNA molecules were capable of catalyzing chemical reactions was obtained in 1982 by Thomas Cech and colleagues at the University of Colorado. As discussed at length in the Experimental Path-

ways for this chapter, these researchers found that the ciliated protozoan Tetrahymena synthesized an rRNA precursor (a pre-rRNA) that was capable of splicing itself. In addition to revealing the existence of RNA enzymes, or ribozymes, these experiments changed the thinking among biologists about the relative roles of RNA and protein in the mechanism of RNA splicing. The intron in the Tetrahymena pre-rRNA is an example of a group I intron, which won’t be discussed. Another type of self-splicing intron, called a group II intron, was subsequently discovered in fungal mitochondria, plant chloroplasts, and a variety of bacteria and archaea. Group II introns fold into a complex structure shown two-dimensionally in Figure 11.31a. Group II introns undergo self-splicing by passing through an intermediate stage, called a lariat (Figure 11.31b) because it resembles the type of rope used by cowboys to catch runaway calves. The first step in group II intron splicing is the cleavage of the 5⬘ splice site (step 1, Figure 11.31b), followed by formation of a lariat by means of a covalent bond between the 5⬘ end of the intron and an adenosine residue near the 3⬘ end of the intron (step 2). The subsequent cleavage of the 3⬘ splice site releases the lariat and allows the cut ends of the exon to be covalently joined (ligated) (step 3). The steps that occur during the removal of introns from pre-mRNA molecules in eukaryotic cells are quite similar to those followed by group II introns. The primary difference is that the pre-mRNA is not able to splice itself, requiring instead a host of small nuclear RNAs (snRNAs) and their associated proteins. As each large hnRNA molecule is transcribed, it becomes associated with a variety of proteins to form an hnRNP (heterogeneous nuclear ribonucleoprotein), which represents the substrate for the processing reactions that follow. 
Processing occurs as each intron of the pre-mRNA becomes associated with a dynamic macromolecular machine called a spliceosome. Each spliceosome consists of a variety of proteins and a number of distinct ribonucleoprotein particles, called snRNPs because they are composed of snRNAs bound to specific proteins. Spliceosomes are not present within the nucleus in a prefabricated state, but rather are assembled as their component snRNPs bind to the pre-mRNA. Once the spliceosome machinery is assembled, the

4. Approximately 1 percent of introns have AT and AC dinucleotides at their 5′ and 3′ ends (rather than GU and AG). These AT/AC introns are processed by a different type of spliceosome, called a U12 spliceosome, because it contains a U12 snRNA in place of the U2 snRNA of the major spliceosome.


Figure 11.31 The structure and self-splicing pathway of group II introns. (a) Two-dimensional structure of a group II intron (shown in red). The intron folds into six characteristic domains that radiate from a central structure. The asterisk indicates the adenosine nucleotide that bulges out of domain VI and forms the lariat structure as described in the text. The two ends of the intron become closely applied to one another as indicated by the proximity of the two intron–exon boundaries. (b) Steps in the self-splicing of group II introns. In step 1, the 2′ OH of an adenosine within the intron (asterisk in domain VI of part a) carries out a nucleophilic attack on the 5′ splice site, cleaving the RNA and forming an unusual 2′–5′ phosphodiester bond with the first nucleotide of the intron (step 2). This branched structure is described as a lariat. Also shown in step 2, the free 3′ OH of the displaced exon attacks the 3′ splice site, which cleaves the RNA at the other end of the intron. As a result of this reaction, the intron is released as a free lariat, and the 3′ and 5′ ends of the two flanking exons are ligated (step 3). A similar pathway is followed in the splicing of introns from pre-mRNAs, but rather than occurring by self-splicing, these steps require the aid of a number of additional factors.


snRNPs carry out the reactions that cut the introns out of the transcript and paste the ends of the exons together. The excised introns, which constitute more than 95 percent of the average mammalian pre-mRNA, are simply degraded within the nucleus.

Our understanding of the steps in RNA splicing has been achieved largely through studies of cell-free extracts that can accurately splice pre-mRNAs in vitro. Some of the major steps in the assembly of a spliceosome and removal of an intron are indicated in Figure 11.32 and described in some detail in the accompanying legend. Taken together, removal of an intron requires several snRNP particles: the U1 snRNP, U2 snRNP, U5 snRNP, and the U4/U6 snRNP, which contains the U4 and U6 snRNAs bound together. In addition to its snRNA, each snRNP contains a dozen or more proteins. The members of one family, called Sm proteins, are present in all of the snRNPs. Sm proteins bind to one another and to a conserved site on each snRNA (except U6 snRNA) to form the core of the snRNP. Figure 11.33 shows a structural model of the U1 snRNP, with the locations of the snRNA, Sm proteins, and other proteins within the particle indicated. Sm proteins were first identified because they are the targets of antibodies produced by patients with the autoimmune disease systemic lupus erythematosus. The other proteins of the snRNPs are unique to each particle.

The events described in Figure 11.32 provide excellent examples of the complex and dynamic interactions that can occur between RNA molecules. The multiple rearrangements among RNA molecules that occur during the assembly of a spliceosome are primarily mediated by ATP-consuming RNA helicases present within the snRNPs. RNA helicases can unwind double-stranded RNAs, such as the U4–U6 duplex shown in Figure 11.32, inset B, which allows the displaced RNAs to bind new partners. Spliceosomal helicases are also thought to strip RNAs from bound proteins, including the U2AF protein of Figure 11.32, inset A. At least eight different helicases have been implicated in splicing of pre-mRNAs in yeast.

The fact that (1) pre-mRNAs are spliced by the same pair of chemical reactions that occur as group II introns splice themselves, and (2) the snRNAs that are required for splicing pre-mRNAs closely resemble parts of the group II introns, suggested that the snRNAs are the catalytically active components of the snRNPs, not the proteins.
According to this scenario, the spliceosome would act as a ribozyme and the proteins would serve various supplementary roles, such as maintaining the proper three-dimensional structure of the snRNA and selecting the splice sites to be used during the processing of a particular pre-mRNA. Of the various snRNAs


Chapter 11 Gene Expression: From Transcription to Translation


Figure 11.32 Schematic model of the assembly of the splicing machinery and some of the steps that occur during pre-mRNA splicing. Step 1 shows the portion of the pre-mRNA to be spliced. In step 2, the first of the splicing components, U1 snRNP, has become attached at the 5′ splice site of the intron. The nucleotide sequence of U1 snRNA is complementary to the 5′ splice site of the pre-mRNA, and evidence indicates that U1 snRNP initially binds to the 5′ side of the intron by the formation of specific base pairs between the splice site and U1 snRNA (see inset A). The U2 snRNP is next to enter the splicing complex, binding to the pre-mRNA (as shown in inset A) in a way that causes a specific adenosine residue (dot) to bulge out of the surrounding helix (step 3). This is the site that later becomes the branch point of the lariat. U2 is thought to be recruited by the protein U2AF, which binds to the polypyrimidine tract near the 3′ splice site. U2AF also interacts with SR proteins that bind to the exonic splicing enhancers (ESEs). These interactions play an important role in recognizing intron/exon borders. The next step is the binding of the U4/U6 and U5 snRNPs to the pre-mRNA with the accompanying displacement of U1 (step 4). The assembly of a spliceosome involves a series of dynamic interactions between the pre-mRNA and specific snRNAs and among the snRNAs themselves. As they enter the complex with the pre-mRNA, the U4 and U6 snRNAs are extensively base-paired to one another (inset B). The U4 snRNA is subsequently stripped away from the duplex, and the regions of U6 that were paired with U4 become base-paired to a portion of the U2 snRNA (inset C). Another portion of the U6 snRNA is situated at the 5′ splice site (inset C), having displaced the U1 snRNA that was previously bound there (inset A). It is proposed that U6 is a ribozyme and that U4 is an inhibitor of its catalytic activity. According to this hypothesis, once the U1 and U4 snRNAs have been displaced, the U6 snRNA is in position to catalyze the two chemical reactions required for intron removal. According to an alternate view, the reactions are catalyzed by the combined activity of U6 snRNA and a protein of the U5 snRNP. Regardless of the mechanism, the first reaction (indicated by the arrow in inset C) results in the cleavage of the 5′ splice site, forming a free 5′ exon and a lariat intron–3′ exon intermediate (step 5). The free 5′ exon is thought to be held in place by its association with the U5 snRNA of the spliceosome, which also interacts with the 3′ exon (step 5). The first cleavage reaction at the 5′ splice site is followed by a second cleavage reaction at the 3′ splice site (arrow, step 5), which excises the lariat intron and simultaneously joins the ends of the two neighboring exons (step 6). Following splicing, the snRNPs must be released from the pre-mRNA, the original associations between snRNAs must be restored, and the snRNPs must be reassembled at the sites of other introns.
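The recognition of a 5′ splice site by base pairing with U1 snRNA can be illustrated by counting Watson–Crick pairs between two antiparallel strands. This is only a sketch under stated assumptions: the two sequences below are illustrative stand-ins (an unmodified version of the 5′ end of U1 snRNA and an idealized splice site that pairs with it perfectly), not sequences taken from Figure 11.32, and the helper `count_base_pairs` is hypothetical. Natural splice sites generally pair less than perfectly, and G–U wobble pairs are ignored here.

```python
# Watson-Crick pairs allowed in this toy model (wobble pairs are ignored).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_base_pairs(snrna_5to3: str, site_5to3: str) -> int:
    """Count complementary positions when two RNAs are aligned antiparallel."""
    if len(snrna_5to3) != len(site_5to3):
        raise ValueError("this sketch only compares strands of equal length")
    # Reversing one strand lines up its 3' end with the other strand's 5' end.
    return sum((a, b) in PAIRS for a, b in zip(reversed(snrna_5to3), site_5to3))


u1_5prime_end = "AUACUUACCUG"   # assumed, illustrative snRNA segment
ideal_site = "CAGGUAAGUAU"      # exon ...CAG | GUAAGUAU... intron
mutant_site = "CAGGCAAGUAU"     # same site with the intron's second base changed

print(count_base_pairs(u1_5prime_end, ideal_site))
print(count_base_pairs(u1_5prime_end, mutant_site))
```

A single substitution at the splice site removes one pair from the duplex, which gives a rough feel for why mutations at intron boundaries can interfere with recognition by the splicing machinery.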


Figure 11.33 The structure of an snRNP. (a) Model of a U1 snRNP particle based on biochemical data and structural information obtained by cryoelectron microscopy. At the core of the particle is a ring-shaped protein complex composed of the seven different Sm proteins that are common to all U snRNPs. Three other proteins are unique to the U1 snRNP (named 70K, U1-A, and U1-C). Stems I, II and IV are parts of the 165-base U1 snRNA. The snRNP is assembled in the cytoplasm and imported into the nucleus, where it carries out its function. (b) A model of the U1 snRNA in the same orientation as in a. A higher resolution model can be found in Nature 458: 475, 2009. (FROM HOLGER STARK ET AL., NATURE 409:541, 2001. REPRODUCED WITH PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)


that participate in RNA splicing, U6 is considered the most likely candidate for the catalytic species. However, recent studies have placed at least one of the proteins of the U5 snRNP (namely, a conserved protein called Prp8) very close to the catalytic site of the spliceosome. In addition, Prp8 contains an RNase domain that might be well suited for cleaving the pre-mRNA. This finding has revived the proposal that the combined action of both an RNA and a protein component of the spliceosome is responsible for catalyzing the two chemical reactions required for RNA splicing. It was mentioned above that sequences situated within exons, called exonic splicing enhancers (ESEs), play a key role in recognition of exons by the splicing machinery. ESEs serve as binding sites for a family of RNA-binding proteins, called SR proteins—so named because of their large number of arginine (R)–serine (S) dipeptides. SR proteins are thought to form interacting networks that span the intron/exon borders

and help recruit snRNPs to the splice sites (see Figure 11.32, inset A). Positively charged SR proteins may also bind electrostatically to the negatively charged phosphate groups that are added to the CTD of the polymerase as transcription is initiated (Figure 11.20). As a result, the assembly of the splicing machinery at an intron occurs in conjunction with the synthesis of the intron by the polymerase. The CTD is thought to recruit a wide variety of processing factors. In fact, most of the machinery required for mRNA processing and export to the cytoplasm travels with the polymerase as part of a giant “mRNA factory” (Figure 11.34). Because most genes contain a number of introns, the splicing reactions depicted in Figure 11.32 must occur repeatedly on a single primary transcript. Evidence suggests that introns are removed in a preferred order, generating specific processing intermediates whose size lies between that of the primary transcript and the mature mRNA. An example of the


Figure 11.34 Schematic representation of a mechanism for the coordination of transcription, capping, polyadenylation, and splicing. In this simplified model, the C-terminal domain (CTD) of the large subunit of the RNA polymerase (page 443) serves as a flexible scaffold for the organization of factors involved in processing pre-mRNAs, including those for capping, polyadenylation, and intron removal. In addition to the proteins depicted here, the polymerase is probably associated with a host of transcription factors, as well as enzymes that modify the chromatin template. The proteins bound to the polymerase at any particular time may depend on which of the serine residues of the CTD are phosphorylated. The pattern of phosphorylated serine residues changes as the polymerase proceeds from the beginning to the end of the gene being transcribed (compare to Figure 11.20). The phosphate groups linked to the #5 residues are largely lost by the time the polymerase has transcribed the 3′ end of the RNA. (AFTER E. J. STEINMETZ, CELL 89:493, 1997; BY PERMISSION OF CELL PRESS.)

11.4 Synthesis and Processing of Eukaryotic Messenger RNAS


Figure 11.35 Processing the ovomucoid pre-mRNA. The photograph shows a Northern blot, a technique in which extracted RNA (in this case from the nuclei of hen oviduct cells) is fractionated by gel electrophoresis and blotted onto a membrane filter. The immobilized RNA on the filter is then incubated with a radioactively labeled cDNA (in this case a cDNA made from the ovomucoid mRNA) to produce bands that reveal the positions of RNAs containing the complementary sequence. The mature mRNA encoding the ovomucoid protein is 1100 nucleotides long and is shown at the bottom of the blot. It is evident that the nucleus contains a number of larger-sized RNAs that also contain the nucleotide sequence of the ovomucoid mRNA. The largest RNA on the blot has a length of 5450 nucleotides, which corresponds to the size of the ovomucoid transcription unit; this RNA is presumably the primary transcript from which the mRNA is ultimately carved. Other prominent bands contain RNAs with lengths of 3100 nucleotides (which corresponds to a transcript lacking introns 5 and 6), 2300 nucleotides (a transcript lacking introns 4, 5, 6, and 7), and 1700 nucleotides (a transcript lacking all introns except number 3). (COURTESY OF BERT O’MALLEY.)
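The band sizes quoted in the legend allow a rough estimate of how much RNA is removed at each stage of processing. The short calculation below is a sketch that uses only the lengths given in the legend; it treats the differences between successive bands as intron lengths, which is an approximation, since band sizes on a blot are estimates and other modifications of the transcript are ignored.

```python
# Band sizes in nucleotides, as quoted in the Figure 11.35 legend,
# ordered from the primary transcript down to the mature mRNA.
bands = [
    ("primary transcript (all introns present)", 5450),
    ("introns 5 and 6 removed", 3100),
    ("introns 4, 5, 6, and 7 removed", 2300),
    ("all introns except intron 3 removed", 1700),
    ("mature mRNA (no introns)", 1100),
]

sizes = [size for _, size in bands]

# Nucleotides lost between each pair of successive bands.
removed = [larger - smaller for larger, smaller in zip(sizes, sizes[1:])]

total_intron = sizes[0] - sizes[-1]
intron_fraction = total_intron / sizes[0]

for (label, _), lost in zip(bands[1:], removed):
    print(f"{label}: about {lost} nucleotides removed in this step")
print(f"introns account for about {intron_fraction:.0%} of the primary transcript")
```

On these numbers, introns 5 and 6 together account for roughly 2350 nucleotides, introns 4 and 7 for roughly 800, and intron 3 for roughly 600, with the remaining 600 nucleotides attributable to the other introns.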

intermediates that form during the nuclear processing of the ovomucoid mRNA in cells of the hen’s oviduct is shown in Figure 11.35.

Evolutionary Implications of Split Genes and RNA Splicing

The discovery that RNAs are capable of catalyzing chemical reactions has had an enormous impact on our view of biological evolution. Ever since the discovery of DNA as the genetic material, biologists have wondered which came first, protein or DNA. The dilemma arose from the seemingly nonoverlapping functions of these two types of macromolecules. Nucleic acids store information, whereas proteins catalyze reactions.

With the discovery of ribozymes in the early 1980s, it became apparent that one type of molecule—RNA—could do both. These findings have fueled the belief that both DNA and protein were absent at an early stage in the evolution of life. During this period, RNA molecules performed double duty: they served as genetic material, and they catalyzed chemical reactions including those required for RNA replication. Life at this stage is described as an “RNA world.” Only at a later stage in evolution were the functions of catalysis and information storage taken over by protein and DNA, respectively, leaving RNA to function primarily as a go-between in the flow of genetic information. Many researchers believe that splicing provides an example of a legacy from an ancient RNA world. Although the presence of introns creates an added burden for cells, because they have to remove these intervening sequences from their transcripts, introns are not without their virtues. As we will see in the following chapter, RNA splicing is one of the steps along the path to mRNA formation that is subject to cellular regulation. Many primary transcripts can be processed by two or more pathways so that a sequence that acts as an intron in one pathway becomes an exon in an alternate pathway. As a result of this process, called alternative splicing, the same gene can code for more than one polypeptide. The presence of introns is also thought to have had a major impact on biological evolution. When the amino acid sequence of a protein is examined, it is often found to contain sections that are homologous to parts of several other proteins (see Figures 2.36 and 7.22 for examples). Proteins of this type are encoded by genes that are almost certainly composites made up of parts of other genes. The movement of genetic “modules” among unrelated genes—a process called exon shuffling—is greatly facilitated by the presence of introns, which act like inert spacer elements between exons. 
Genetic rearrangements require breaks in DNA molecules, which can occur within introns without introducing mutations that might impair the organism. Over time, exons can be shuffled independently in various ways, allowing a nearly infinite number of combinations in search for new and useful coding sequences. As a result of exon shuffling, evolution need not occur only by the slow accumulation of point mutations but might also move ahead by “quantum leaps” with new proteins appearing in a single generation.
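The idea that one gene can give rise to more than one mRNA through alternative splicing can be sketched by enumerating the products obtained when certain exons are either retained or skipped. This is a deliberately simplified illustration: the exon names and sequences are invented, the helper `isoforms` is hypothetical, and real alternative splicing involves regulated choices among splice sites rather than a free enumeration.

```python
from itertools import product


def isoforms(exons: dict, optional: list):
    """Yield every mRNA formed by keeping or skipping each optional exon.

    exons    -- mapping of exon name to sequence, in 5'-to-3' order
    optional -- names of exons that may be treated as intron and skipped
    """
    for choice in product([True, False], repeat=len(optional)):
        keep = dict(zip(optional, choice))
        # Constitutive exons are always kept; optional ones follow `keep`.
        yield "".join(seq for name, seq in exons.items() if keep.get(name, True))


# Invented three-exon gene in which the middle exon is optional.
gene = {"E1": "AUGGCU", "E2": "GAAUUC", "E3": "CCAUAA"}
for mrna in isoforms(gene, optional=["E2"]):
    print(mrna)
```

With n optional exons this enumeration yields 2^n candidate mRNAs, which conveys how a modest number of exons can generate a large number of combinations.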

Creating New Ribozymes in the Laboratory

The main sticking point in the minds of many biologists concerning the feasibility of an RNA world in which RNA acted as the sole catalyst is that, to date, only a few reactions have been found to be catalyzed by naturally occurring RNAs. These include the cleavage and ligation of phosphodiester bonds required for RNA splicing and the formation of peptide bonds during protein synthesis. Are these the only types of reactions that RNA molecules are capable of catalyzing, or has their catalytic repertoire been sharply restricted by the evolution of more efficient protein enzymes? Several groups of researchers have explored the catalytic potential of RNA by creating new RNA molecules in the laboratory. Although


REVIEW

1. What is a split gene? How was the existence of split genes discovered?
2. What is the relationship between hnRNAs and mRNAs? How was this relationship uncovered?
3. What are the general steps in the processing of a pre-mRNA into an mRNA? What is the role of the snRNAs and the spliceosome?
4. What is meant by the term RNA world? What type of evidence argues for its existence?
5. What is meant by the phrase “test-tube evolution” as it applies to the catalytic activity of RNA molecules?

11.5 Small Regulatory RNAs and RNA Silencing Pathways

The idea that RNA molecules are directly involved in the regulation of gene expression began with a puzzling observation. The petals of a petunia plant are normally light purple. In 1990, two groups of investigators reported on an attempt to deepen the color of the flowers by introducing extra copies of a gene encoding a pigment-producing enzyme. To the surprise of the researchers, the presence of the extra genes caused the petals to lose their pigmentation rather than become more darkly pigmented as expected (Figure 11.36a). Subsequent studies indicated that, under these experimental conditions, both the added genes and their normal counterparts within


Figure 11.36 RNA interference. (a) Petunia plants normally have purple flowers. The flowers of this plant appear white because the cells contain an extra gene (a transgene) that encodes an enzyme required for pigment production. The added gene has triggered RNA interference leading to the specific destruction of mRNAs transcribed from both the transgene and the plant’s own genes, causing the flowers to be largely unpigmented. (b) A nematode worm containing a gene encoding a GFP fusion protein (page 273) that is expressed specifically in the animal’s pharynx. (c) This worm developed from a parent containing the same genotype as that shown in b, whose gonad had been injected with a solution of dsRNA that is complementary to the mRNA encoding the GFP fusion protein. The absence of visible staining reflects the destruction of the mRNA by RNA interference. (A: FROM DAVID BAULCOMBE, CURR. BIOL. 12:R82, 2002, WITH PERMISSION FROM ELSEVIER. COURTESY DAVID BAULCOMBE, SAINSBURY LABORATORY, NORWICH, UK; B, C: REPRINTED FROM WWW.NEB.COM (2012) WITH PERMISSION FROM NEW ENGLAND BIOLABS.)


these experiments can never prove that such RNA molecules existed in ancient organisms, they provide proof of the principle that such RNA molecules could have evolved through a process of natural selection.

In one approach, researchers have created catalytic RNAs from scratch without any preconceived design as to how the RNA should be constructed. The RNAs are produced by allowing automated DNA-synthesizing machines to assemble DNAs with random nucleotide sequences. Transcription of these DNAs produces a population of RNAs whose nucleotide sequences are also randomly determined. Once a population of RNAs is obtained, individual members can be selected from the population by virtue of particular properties they possess. This approach has been described as “test-tube evolution.” In one group of studies, researchers initially selected for RNAs that bound to specific amino acids and, subsequently, for a subpopulation that would transfer a specific amino acid onto the 3′ end of a targeted tRNA. This is the same basic reaction carried out by aminoacyl-tRNA synthetases, which are enzymes that link amino acids to tRNAs as required for protein synthesis (page 466).

It is speculated that amino acids may have been used initially as adjuncts (cofactors) to enhance catalytic reactions carried out by ribozymes. Over time, ribozymes presumably evolved that were able to string specific amino acids together to form small proteins, which were more versatile catalysts than their RNA predecessors. As we will see later in this chapter, ribosomes (the ribonucleoprotein machines responsible for protein synthesis) are essentially ribozymes at heart, which provides strong support for this evolutionary scenario.
As proteins took over a greater share of the workload in the primitive cell, the RNA world was gradually transformed into an “RNA–protein world.” At a later point in time, RNA was presumably replaced by DNA as the genetic material, which propelled life forms into the present “DNA–RNA– protein world.” The evolution of DNA may have required only two types of enzymes: a ribonucleotide reductase to convert ribonucleotides into deoxyribonucleotides and a reverse transcriptase to transcribe RNA into DNA. The fact that RNA catalysts do not appear to be involved in either DNA synthesis or transcription supports the idea that DNA was the last member of the DNA–RNA–protein triad to appear on the scene. Somewhere along the line of evolutionary progress, a code had to evolve that would allow the genetic material to specify the sequence of amino acids to be incorporated into a given protein. The nature of this code is the subject of a later section of this chapter.
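The logic of “test-tube evolution” (generate a random pool, select the members with a desired property, amplify them with occasional errors, and repeat) can be sketched as a small simulation. Everything specific here is an assumption made for illustration: the `activity` function is a hypothetical stand-in for a laboratory binding or catalysis assay (it simply scores similarity to an arbitrary motif), and the pool size, sequence length, selection fraction, and error rate are arbitrary.

```python
import random

BASES = "ACGU"


def random_pool(rng, size, length):
    """Random-sequence RNAs, as if transcribed from randomly synthesized DNAs."""
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(size)]


def activity(rna, motif):
    """Hypothetical assay: best match count to `motif` over all windows."""
    k = len(motif)
    return max(
        sum(a == b for a, b in zip(rna[i:i + k], motif))
        for i in range(len(rna) - k + 1)
    )


def mutate(rng, rna, rate):
    """Error-prone copying: each base is re-drawn at random with probability `rate`."""
    return "".join(rng.choice(BASES) if rng.random() < rate else b for b in rna)


def evolve(rng, pool, motif, rounds, keep=0.1, rate=0.02):
    """Repeat selection and amplification; return final pool and best score per round."""
    history = []
    for _ in range(rounds):
        pool = sorted(pool, key=lambda r: activity(r, motif), reverse=True)
        history.append(activity(pool[0], motif))
        # Selection: only the most active fraction of the pool survives.
        survivors = pool[: max(1, int(len(pool) * keep))]
        # Amplification: survivors are kept and the pool is refilled with
        # error-prone copies of randomly chosen survivors.
        copies = [mutate(rng, rng.choice(survivors), rate)
                  for _ in range(len(pool) - len(survivors))]
        pool = survivors + copies
    return pool, history


rng = random.Random(0)
start_pool = random_pool(rng, size=200, length=30)
final_pool, history = evolve(rng, start_pool, motif="GGAUCCGAUC", rounds=8)
print(history)  # best score found in each round
```

Because the survivors of each round are carried over unchanged, the best score in the pool can only stay the same or rise from one round to the next, which mirrors the way repeated selection enriches a population for the selected property.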


the genome were being transcribed, but the resulting mRNAs were somehow degraded. The phenomenon became known as posttranscriptional gene silencing (PTGS).

It wasn’t until 1998 that the molecular basis for this form of gene silencing became understood. In that year, Andrew Fire of the Carnegie Institution of Washington and Craig Mello of


Figure 11.37 The formation and mechanism of action of siRNAs and miRNAs. (a) In step 1, both strands of a double-stranded RNA are cleaved by the endonuclease Dicer to form a small (21–23 nucleotide) siRNA, which has overhanging ends (step 2). In step 3, the siRNA becomes associated with a protein complex, a pre-RISC, that contains an Argonaute protein (typically Ago2) capable of cleaving and removing the passenger strand of the siRNA duplex. In step 4, the single-stranded guide siRNA, in association with proteins of the RISC complex, binds to a target RNA that has a complementary sequence. The target RNA might be a viral RNA, a transcript from a transposon, or an mRNA, depending on circumstances. In step 5, the target RNA is cleaved at a specific site by the Argonaute protein and subsequently degraded. (b) MicroRNAs are derived from single-stranded precursor RNAs that contain complementary sequences that allow them to fold back on themselves to form a double-stranded RNA with a stem-loop at one end (step a). This pseudo-dsRNA (or pri-miRNA) is cleaved at a specific site near its terminal loop by a protein complex containing an endonuclease named Drosha to generate a pre-miRNA that has a 3′ overhang at one end. The pre-miRNA is exported to the cytoplasm (step 1) where it is cleaved by Dicer into a small duplex miRNA (step 2) that has a 3′ overhang at both ends. In step 3, the double-stranded RNA becomes associated with a protein complex containing an Argonaute protein (typically Ago1), leading to the separation of the strands and removal of the passenger strand (called miRNA*). The single-stranded guide miRNA then binds to a complementary region on an mRNA (step 4) and inhibits translation of the message as shown in step 5 (or alternatively leads to deadenylation and degradation of the mRNA as discussed in Section 12.6). Unlike siRNAs, miRNAs that inhibit translation are only partially complementary to the target mRNA, hence the bulge. Most plant miRNAs and a handful of animal miRNAs are precisely complementary to the mRNA, or nearly so; in these cases, the outcome of the interaction tends to be cleavage of the mRNA by Ago2 in the same manner shown in a. (A certain class of miRNAs derived from introns do not form long hairpin pri-miRNAs and do not require Drosha for processing.)
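The siRNA pathway in part a can be summarized as two operations on sequences: cutting a long dsRNA into short duplexes, and using one strand of each duplex as a guide to find a complementary site on a target RNA. The sketch below makes several simplifying assumptions that are not part of the legend: fragments are exactly 21 nucleotides long, the 3′ overhangs are ignored, the antisense strand is always taken as the guide, and only perfectly complementary sites count as targets. The helper names `dice` and `find_targets` are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the strand that would pair with `rna`, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense_strand: str, size: int = 21) -> list:
    """Cut a dsRNA (given by its sense strand) into consecutive fragments
    and return the antisense strand of each fragment as a guide RNA."""
    return [
        reverse_complement(sense_strand[i:i + size])
        for i in range(0, len(sense_strand) - size + 1, size)
    ]


def find_targets(guide: str, target_rna: str) -> list:
    """Return start positions in `target_rna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    hits = []
    start = target_rna.find(site)
    while start != -1:
        hits.append(start)
        start = target_rna.find(site, start + 1)
    return hits


# Invented 120-nucleotide target mRNA and a dsRNA matching positions 30-71 of it.
rng = random.Random(1)
mrna = "".join(rng.choice("ACGU") for _ in range(120))
trigger_sense = mrna[30:72]

guides = dice(trigger_sense)
for guide in guides:
    print(guide, find_targets(guide, mrna))
```

Each guide locates the stretch of the mRNA from which it was derived, which is the sense in which the small RNA supplies the specificity while the protein partner carries out the cleavage.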


strand) is incorporated, together with its Argonaute partner, into a related protein complex named RISC. Several different varieties of RISCs have been identified, distinguished by the particular Argonaute protein they contain. In animal cells, RISCs that contain an siRNA typically include the Argonaute protein Ago2. The RISC provides the machinery for the tiny single-stranded siRNA to bind to an RNA having a complementary sequence. Once bound, the target RNA is cleaved at a specific site by the ribonuclease activity of the Ago2 protein. Thus, the siRNA acts like a guide to direct the complex toward a complementary target RNA, which is then destroyed by a protein partner. Each siRNA can orchestrate the destruction of numerous copies of the target RNA, which might be a viral transcript, an RNA transcribed from a transposon or retrotransposon, or a host mRNA that is being targeted by a researcher as described in the beginning of this section.

RNA interference has been studied largely in plants and nematode worms. In fact, it is generally accepted that vertebrates do not utilize RNAi as a defense against viruses, relying instead on a well-developed immune system.5 Indeed, when a dsRNA is added to mammalian cells growing in culture, or injected directly into the body of a mammal, rather than stopping the translation of a specific protein it usually initiates a global response that inhibits protein synthesis in general. This global shutdown of protein synthesis (discussed in Section 17.1) is thought to have evolved as a means to protect cells from infection by viruses. To get around this large-scale global response, researchers turned to the use of very small dsRNAs. They found that introduction into mammalian cells of synthetic dsRNAs that were 21 nucleotides long (i.e., equivalent in size to the siRNAs produced as intermediates during RNA interference in other organisms) did not trigger global inhibition of protein synthesis.
Moreover, such dsRNAs were capable of RNAi, that is, they inhibited synthesis of the specific protein encoded by an mRNA having a matching nucleotide sequence. Proteins encoded by other mRNAs were generally not affected. This technique has become a major experimental strategy to learn more about the function of specific genes. One can simply knock out (or, more accurately, knock down) the activity of the specific gene in question, using dsRNAs to destroy the transcribed mRNAs, and look for any cellular abnormalities that result from a deficiency of the encoded protein (see Figures 8.8 and 18.51). Libraries containing thousands of siRNAs have been constructed for use in experiments aimed at determining the activities of large numbers of genes in a single study. The potential clinical importance of RNAi is discussed in the accompanying Human Perspective.

5. Mammals possess all of the components needed to generate siRNAs. Recent studies have indicated that siRNAs (or endo-siRNAs, as they are called) are produced in mammalian oocytes and embryonic stem cells, in part as a mechanism of defense against the movement of transposable elements in these cells. Endo-siRNAs may also be derived from other types of dsRNAs that form from the binding of sense and antisense RNAs or long RNAs that fold back to form double-stranded hairpins. The importance of endo-siRNAs in mammals is unclear.


the University of Massachusetts and their colleagues conducted an experiment on the nematode C. elegans. They injected these worms with several different preparations of RNA hoping to stop production of a particular muscle protein. One of the preparations contained “sense” RNA, that is, an RNA having the sequence of the mRNA that encoded the protein being targeted; another preparation contained “antisense” RNA, that is, an RNA having the complementary sequence of the mRNA in question; and a third preparation consisted of a double-stranded RNA containing both the sense and antisense sequences bound to one another. Neither of the single-stranded RNAs had much of an effect, but the double-stranded RNA was very effective in stopping production of the encoded protein. Fire and Mello described the phenomenon as RNA interference (RNAi). They demonstrated that double-stranded RNAs (dsRNAs) were taken up by cells where they induced a response leading to the selective destruction of mRNAs having the same sequence as the added dsRNA. Let’s say, for example, one sought to stop the production of the enzyme glycogen phosphorylase within the cells of a nematode worm so that the effect of this enzyme deficiency on the phenotype of the worm could be determined. Remarkably, this result can be attained by simply placing the worm in a solution of dsRNA that shares the same sequence as the target mRNA. A similar experiment is shown in Figure 11.36b,c. This phenomenon is similar in effect, though entirely different in mechanism, to the formation of knockout mice that are lacking a particular gene that encodes a particular protein. The phenomenon of dsRNA-mediated RNA interference is an example of the broader phenomenon of RNA silencing, in which small RNAs, typically working in conjunction with protein machinery, act to inhibit gene expression in various ways. 
RNAi is thought to have evolved as a type of “genetic immune system” to protect organisms from the presence of foreign or unwanted genetic material. To be more specific, RNAi probably evolved as a mechanism to block the replication of viruses and/or to suppress the movements of transposons within the genome because both of these potentially dangerous processes can involve the formation of double-stranded RNAs. Cells can recognize dsRNAs as “undesirable” because such molecules are not produced by the cell’s normal genetic activities. The steps involved in the RNAi pathway are shown in Figure 11.37a. The double-stranded RNA that initiates the response is first cleaved into small (21–23 nucleotide), double-stranded fragments, called small interfering RNAs (siRNAs), by a particular type of ribonuclease, called Dicer. Enzymes of the class to which Dicer belongs act specifically on dsRNA substrates and generate small dsRNAs that have 3′ overhangs, as shown in Figure 11.37a. These small dsRNAs are then loaded into a complex (identified as a pre-RISC in Figure 11.37a) that contains a member of the Argonaute family of proteins. Argonaute proteins play a key role in all of the known RNA silencing pathways. In the RNAi pathway, one of the strands of the RNA duplex (called the passenger strand) is cleaved in two and then dissociates from the pre-RISC. The other strand of the RNA duplex (called the guide

THE HUMAN PERSPECTIVE

Chapter 11 Gene Expression: From Transcription to Translation

Clinical Applications of RNA Interference

Medical scientists are continually searching for “magic bullets,” therapeutic compounds that fight a particular disease in a highly specific manner without toxic side effects. We can consider two major types of diseases—viral infections and cancer—that have become targets for a new type of molecular “magic bullet.” Viruses are able to ravage an infected cell because they synthesize messenger RNAs that encode viral proteins that disrupt cellular activities. Most cells that become cancerous contain mutations in certain genes (called oncogenes), causing them to produce mutant mRNAs that are translated into abnormal versions of cellular proteins. Consider what might happen if a patient with one of these diseases could be treated with a drug that destroyed or inhibited the specific mRNAs transcribed from the viral genome or mutant cancer gene while, at the same time, ignoring all the other mRNAs of the cell. Several strategies have been developed in recent years that have this goal in mind; the most recent of these strategies takes advantage of the phenomenon of RNA interference. As discussed above, mammalian cells can be subjected to RNAi—a process that leads to the degradation of a specific mRNA—by introducing the cells to a double-stranded siRNA in which one of the strands is complementary to the mRNA being targeted. When cells are incubated with a 21- to 23-nucleotide synthetic siRNA, the molecules are taken up by the cells and incorporated into an mRNA-cleaving ribonucleoprotein complex (as in Figure 11.37a), which attacks the complementary mRNA. Alternatively, RNAi can be induced within mammalian cells that have been genetically engineered to carry a gene with inverted repeats. When the gene is transcribed, the RNA product folds back on itself to form a hairpin-shaped (i.e., double-stranded) siRNA precursor (similar to that in Figure 11.37b) that is processed into an active siRNA.
The use of synthetic siRNAs is likely to have a transient effect, which may or may not be beneficial depending on the condition being treated. In contrast, the use of viral vectors that incorporate DNA into the cell’s genome has the potential to generate a long-lasting therapeutic effect with a single application. At the same time, however, the use of viruses raises other safety issues. As with all potential drug technologies, the first stages in development of RNAi therapeutics involved preclinical studies in which siRNAs were tested in cells from diseased patients or in animal models (i.e., animals with induced diseases similar to those that afflict humans). Many of these preclinical studies suggested that RNAi held great promise in the treatment of a wide range of diseases and viral infections. We can consider just a few examples. As noted above, cancer cells typically contain one or more mutated genes that encode abnormal proteins responsible for producing the cancerous phenotype. A type of leukemia, for example, is caused by a gene, called BCR-ABL, that is formed by the fusion of two normal genes. An siRNA against the mRNA produced by the BCR-ABL fusion gene proved capable of converting the malignant cells to a normal phenotype in culture. Similarly, an siRNA against a gene carrying an expanded CAG tract of the type that causes Huntington’s disease (page 404) was shown to block the production of the abnormal protein. A large number of preclinical studies probing the therapeutic value of siRNAs have focused on viral infection. Administration of siRNAs complementary to influenza, HIV, or hepatitis viral sequences has prevented or eliminated infections by these viruses in laboratory animals. In the case of AIDS, it is hoped that stem cells can be isolated from a patient, transfected with a vector carrying an

siRNA that targets an HIV-encoded mRNA, and then reinfused back into the patient’s bloodstream. Such cells have the potential to produce the siRNA more or less continuously, causing the cell and its descendants to be resistant to destruction by the virus. There are roadblocks, however, for using RNAi to treat viral infections. The hallmark of RNAi is its extraordinary sequence specificity, which can be both a virtue and a handicap. Ultimately, RNAi may prove to be ineffective against viruses, because these pathogens tend to mutate rapidly. Mutations result in changes in genome sequence, leading to the production of mRNAs that are no longer fully complementary to the therapeutic siRNA. Even though it has been barely more than a decade since RNAi first appeared as an obscure phenomenon in plants and worms (page 455), siRNA technology is now the basis of a growing list of clinical trials. The first test of RNAi therapeutics in human patients came against a form of macular degeneration that is caused by the overgrowth of blood vessels behind the retina of elderly patients, causing progressive loss of vision. The excessive growth of these blood vessels is spurred by the production of a growth factor, VEGF, which induces its effect by binding to the cell-surface receptor VEGFR1. In clinical trials, siRNAs targeting the mRNAs encoding VEGF (or VEGFR1) were injected directly into the eyes of patients with this disease. Early-stage trials suggested that the siRNA was having a beneficial effect but later-stage phase III trials failed to support the findings and work on the drug (called Bevasiranib) was suspended. Another series of clinical tests have begun that are directed against a respiratory virus, RSV, which is a common cause of infant hospitalization. In these trials, adult subjects inhale an aerosol that contains an siRNA (named ALN-RSV01) directed against mRNAs encoding a key viral protein. 
This aerosol treatment had proven highly effective in combating these respiratory infections in laboratory animals. Clinical trials of siRNAs aimed at the treatment of high blood cholesterol, certain cancers, hepatitis, pandemic flu, and other diseases are either planned or underway.
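The sequence specificity discussed above, and the way a single viral mutation can weaken it, can be pictured with a short sketch. This is a minimal illustration and not an siRNA design tool: the 21-nucleotide target is invented for the example, and the function names are hypothetical. It derives the antisense (guide) strand for a target mRNA segment and counts how many positions still pair after one base in the target changes.

```python
# Illustrative only: the target sequence is made up, not a real viral mRNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def paired_positions(guide: str, target: str) -> int:
    """Count Watson-Crick pairs when the guide is aligned antiparallel to the target."""
    assert len(guide) == len(target)
    return sum(COMPLEMENT[g] == t for g, t in zip(guide, reversed(target)))

target = "AGCAUCGCAUCGAUUGCCAUG"          # hypothetical 21-nt mRNA segment
guide = reverse_complement(target)        # antisense strand of the siRNA
mutant = target[:10] + "A" + target[11:]  # single-base change (C to A) in the mRNA

print(paired_positions(guide, target))    # every position pairs
print(paired_positions(guide, mutant))    # one position no longer pairs
```

Counting base pairs is a deliberately crude stand-in for what the RNAi machinery does; it only shows why an siRNA that was fully complementary to a viral mRNA stops being so once the virus mutates.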

Figure 1 The effects of siRNA on the gut epithelium of mice with induced inflammatory disease. (a) Histological appearance of a section through the intestine of a normal mouse. (b) Appearance of the intestine of a mouse that had developed inflammatory colitis but was not treated with a β7-targeted siRNA. (c) Appearance of the intestine of a mouse that had been treated with a β7-targeted siRNA directed against an mRNA encoding cyclin D1. The siRNA has targeted the inflammation-promoting white blood cells and led to a dramatic reduction in intestinal tissue damage. (FROM DAN PEER ET AL., COURTESY OF MOTOMU SHIMAOKA, SCIENCE 319:630, 2008, FIG. 4C, © 2008, REPRINTED WITH PERMISSION FROM AAAS.)

Age-related macular degeneration and RSV respiratory infection are not typical diseases in the sense that they can be treated with local application of an RNAi therapeutic. As with other types of gene therapies, a major obstacle in using RNAi to treat most diseases or infections is the difficulty in delivering siRNAs (or the viral vectors that encode the hairpin siRNA precursors) to affected tissues deep within the body. In an attempt to meet this challenge, a variety of nanosized particles have been developed that (1) either bind to siRNAs or encapsulate them and (2) target the siRNAs to the appropriate tissue or organ following their injection into the bloodstream. We can briefly consider two representative examples of studies that utilize specific targeting strategies. In one report, liposome-like nanoparticles were constructed to contain an antibody that binds to the β7 integrin subunit on the surface of a specific subset of white blood cells known to be involved in inflammation of the digestive tract. These nanoparticles were also constructed to contain an siRNA directed against the mRNA encoding cyclin D1, a regulatory molecule involved in the proliferation of these white blood cells. These siRNA-carrying nanoparticles were then injected into the bloodstream of mice in which inflammation had been induced. The results of the experiment are shown in Figure 1; the anti-β7 antibody has successfully targeted the therapeutic agent to the site of inflammation, and inhibition of cyclin D1 synthesis by the siRNA has led to a dramatic reversal of the disease condition. This study demonstrates the potential for siRNA therapeutics in the treatment of severe inflammatory bowel diseases (IBDs), such as colitis and Crohn’s disease, which are presently treated with drugs that can have debilitating side effects. In another report published in 2010, clinical researchers demonstrated for the first time that siRNAs can be successfully delivered to solid human tumors. Targeted delivery of the siRNA was accomplished by intravenous injection of a synthetic nanoparticle into three patients with metastatic melanoma. The nanoparticles consisted of (1) a dextrin polymer, (2) a human transferrin protein that targets the nanoparticle to transferrin receptors on the surface of the cancer cells, and (3) a coating of polyethylene glycol to protect the particles in the bloodstream against destruction by immune cells. In this case, the bound siRNAs targeted an mRNA that encodes a protein called ribonucleotide reductase M2 (RRM2) that is overexpressed in cancer cells. Nanoparticles were found to accumulate in patients’ tumors, and cleavage of the target mRNA and reduction in the level of the target protein were demonstrated. These results confirm that an injected siRNA can engage the human RNAi machinery and deliver a gene-specific therapeutic agent. The studies described here are only the beginning steps in a long research pathway, but they suggest that RNA interference may one day become a valuable therapeutic strategy against a diverse array of disorders.

MicroRNAs: Small RNAs that Regulate Gene Expression

Figure 11.38 MicroRNAs are synthesized in specific tissues during embryonic development. These micrographs of zebrafish embryos show the specific expression of three different miRNAs whose localization is indicated by the blue stain. miR-124a is expressed specifically in the nervous system (a), miR-206 in skeletal muscle (b) and miR-122 in the liver (c). (FROM ERNO WIENHOLDS, WIGARD KLOOSTERMAN, ET AL, SCIENCE 309:311, 2005, COURTESY OF RONALD H. A. PLASTERK; © 2005, REPRINTED WITH PERMISSION FROM AAAS.)

11.5 Small Regulatory RNAs and RNA Silencing Pathways

By 1993, it was known that C. elegans embryos lacking the gene lin-4 were unable to develop into normal late-stage larvae. In that year, Victor Ambros, Gary Ruvkun, and their colleagues at Harvard University reported that the lin-4 gene encodes a small RNA that is complementary to segments in the 3′ untranslated region of a specific mRNA encoding the protein LIN-14. They proposed that, during larval development, the lin-4 RNA binds to the complementary mRNA, blocking translation of the message, which triggers a transition to the next stage of development. Mutants that are unable to produce the small lin-4 RNA possess an abnormally high level of the LIN-14 protein and cannot transition normally to later larval stages. This was the first clear example of RNA silencing of gene expression, but several years passed before the broader importance of these findings was appreciated. In 2000, it was shown that one of these tiny worm RNAs—a 21-nucleotide species called let-7—is highly conserved throughout evolution. Humans, for example, encode several RNAs that are either identical or nearly identical to let-7. This observation led to an explosion of interest in these RNAs. It has become evident in recent years that both plants and animals produce a collection of tiny RNAs named microRNAs (miRNAs) that, because of their small size, had been overlooked for decades. As first discovered in nematodes, specific miRNAs such as lin-4 and let-7 are synthesized only at certain times during development, or in certain tissues of a plant or animal, and are thought to play a regulatory role. An example of the selective expression of specific miRNAs during the development of the zebrafish is shown in Figure 11.38. The size of miRNAs, at roughly 21–24 nucleotides in length, places them

in the same size range as the siRNAs involved in RNAi. This observation is more than coincidence, since miRNAs are produced by a similar processing machinery to that responsible for the formation of siRNAs. miRNAs and siRNAs might be considered to be “cousins,” as both act in posttranscriptional RNA silencing pathways. There are, however, important differences. An siRNA is derived from the double-stranded product of a virus or transposable element (or a dsRNA provided by a researcher), and it targets the same transcripts from which it arose. In contrast, an miRNA is encoded by a conventional segment of the genome and targets a specific mRNA as part of a normal cellular program. In other words, siRNAs serve primarily to maintain the integrity of the genome, whereas miRNAs serve primarily to regulate gene expression. The pathway for synthesis of a typical miRNA is shown in Figure 11.37b. Most miRNAs are synthesized by RNA polymerase II as a primary transcript with a 5′ cap and poly(A) tail. This primary transcript folds back on itself to form a long, double-stranded, hairpin-shaped RNA called a pri-miRNA shown in step a of Figure 11.37b. The pri-miRNA is cleaved within the nucleus by an enzyme called Drosha into a shorter, double-stranded, hairpin-shaped precursor (or pre-miRNA) shown in step 1. The pre-miRNA is exported to the cytoplasm where it gives rise to the small double-stranded miRNA. The miRNA is carved out of the pre-miRNA by Dicer, the same ribonuclease responsible for the formation of siRNAs. As with siRNAs, the double-stranded miRNA becomes associated with an Argonaute protein in whose company the RNA duplex is disassembled, and one of the single strands is incorporated into a RISC complex as shown in Figure 11.37b. A typical miRNA in animal cells is partially complementary to a region in the 3′ UTR of the mRNA target (as in Figure 11.37b).
In these cases, base-pairing involves six or seven nucleotides near the 5′ end of the miRNA (typically nucleotides 2–8 of the miRNA), which is referred to as the “seed” region, as well as a few additional base pairs elsewhere in the miRNA. The miRNA of the RISC complex guides the associated Argonaute protein into close proximity to the bound mRNA. As discussed in Section 12.6, the Argonaute protein causes either the deadenylation and degradation of the mRNA or repression of its translation (see Figure 12.63). Some miRNAs, however, particularly those found in plants, direct the cleavage of a specific phosphodiester bond within the mRNA backbone. As with siRNAs shown in Figure 11.37a, mRNA cleavage is catalyzed by a “slicer” activity of the Ago2 protein present in the RISC complex. MicroRNAs that direct mRNA cleavage tend to be fully complementary to their mRNA target rather than complementary to only a portion of their target. MicroRNA genes have been identified in a number of different ways: most often from computer analysis of genomic DNA sequences, but also through isolation of mutants or by direct sequencing of small cellular RNAs. Estimates suggest that humans may encode over a thousand distinct miRNA species. Roughly one-third of human messenger RNAs contain sequences that are complementary to likely miRNAs, which provides a hint of the degree to which these small regulatory RNAs may be involved in the control of gene expression in higher organisms. A single miRNA may bind to dozens or even hundreds of different mRNAs. Conversely, many mRNAs contain sequences that are complementary to several different miRNAs, which suggests that these miRNAs may act in various combinations to “fine-tune” the level of gene expression. This prediction is supported by experiments in which cells are forced to take up and express specific miRNA genes. Under these conditions, large numbers of mRNAs in the genetically engineered cells are negatively affected. When different miRNA genes are introduced into these cells, different groups of mRNAs are downregulated, indicating that the effect is sequence-specific. This type of fine-tuning of mRNA expression levels by miRNAs contrasts sharply with the much more dramatic effects of certain miRNAs, such as lin-4, which reduce the expression of a target mRNA to the point that it triggers a major change in the course of development (page 459). MicroRNAs have been implicated in many processes, including the patterning of the nervous system, control of cell proliferation and cell death, leaf and flower development in plants, and differentiation of various cell types (Section 12.6). The roles of miRNAs in the development of cancer are explored at the end of Section 16.3. Dissecting the roles of individual miRNAs during mammalian development and tissue homeostasis promises to be a focus of research over the next decade.
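The idea of a “seed” region can be made concrete with a small sketch of how one might scan a 3′ UTR for candidate miRNA binding sites. The code below is a simplified illustration under stated assumptions: the miRNA and UTR sequences are invented, the function names are hypothetical, and only a perfect match to nucleotides 2–8 is considered, whereas real target prediction also weighs additional pairing and sequence context.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_site(mirna: str) -> str:
    """Sequence an mRNA must contain (5'->3') to pair with miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # nucleotides 2-8, counting from 1 at the 5' end
    return "".join(COMPLEMENT[base] for base in reversed(seed))

def find_seed_matches(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions of seed-complementary sites in a 3' UTR."""
    site = seed_site(mirna)
    last_start = len(utr) - len(site)
    return [i for i in range(last_start + 1) if utr[i:i + len(site)] == site]

mirna = "UACGGAUCCAGUUACGGAUCA"     # hypothetical 21-nt miRNA, written 5'->3'
utr = "AAUUGAUCCGUAACCGGAUCCGUUU"   # hypothetical 3' UTR fragment, written 5'->3'

print(seed_site(mirna))               # GAUCCGU
print(find_seed_matches(mirna, utr))  # [4, 16]
```

Because a seed match is only seven nucleotides long, the same site can occur by chance in many transcripts, which is consistent with the observation that a single miRNA may bind dozens or hundreds of different mRNAs.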

piRNAs: A Class of Small RNAs that Function in Germ Cells

We have seen in Chapter 10 that the movement of transposable elements poses a threat to the genome because it can disrupt the activity of a gene into which the element happens to insert. If this type of genomic jumping were to occur during the life of an adult liver or kidney cell, the consequences would likely be minimal since the transposition event affects only that particular cell and its daughters (if the somatic cell is still capable of cell division). If, however, a transposition event were to occur in a germ cell, that is, a cell that has the capability to give rise to gametes, then the event has the potential to affect every cell in an individual of the next generation. It is not surprising, therefore, that specialized mechanisms have evolved to suppress the movement of transposable elements in germ cells. It was noted on page 457 that one of the apparent functions of siRNAs is to prevent the movement of transposable elements. Recent studies indicate that germ cells of animals express a distinct class of small RNAs, called piwi-interacting RNAs (or piRNAs), that suppress the movement of transposable elements in the germ line. piRNAs get their name from the proteins with which they specifically associate. These proteins are called PIWIs, and they are a subclass of the Argonaute family, the same family of proteins that associate with siRNAs and miRNAs as part of their RISC complexes. The roles of the PIWI proteins have been best studied in fruit flies, where deletion of these proteins leads to defects in the suppression of transposon movement in germ cells and ultimate failure of gamete formation. piRNAs and their associated PIWI proteins are also required for successful gamete formation in male mice, although not in female mice (where protection against transposable elements may be primarily afforded by endo-siRNAs discussed in the footnote on page 457). There are a number of important differences between piRNAs on one hand and si/miRNAs on the other: (1) piRNAs are longer than these other small RNAs, measuring about 24–32 nucleotides in length; (2) the majority of mammalian piRNAs can be mapped to a small number of huge genomic loci, some of which can encode thousands of different piRNAs; (3) piRNAs can be subject to an amplification process that generates additional copies of piRNAs; and (4) the formation of piRNAs does not involve the formation of dsRNA precursors or cleavage by the Dicer ribonuclease. Instead, piRNA biogenesis appears to depend on the endonuclease activity of the PIWI protein acting on a long, single-stranded primary transcript. Those piRNAs that are active in the cell against transposable elements are then amplified in a subsequent step. The mechanism by which piRNAs act to silence target RNA expression remains unclear.

Other Noncoding RNAs

As the field of cell and molecular biology has matured, we have gradually learned to appreciate the remarkable structural and functional diversity of RNA molecules. New types of RNAs have been discovered frequently over the past few decades, as described in the previous pages of this text. But researchers have been unprepared for one flurry of findings, which indicate that at least two-thirds of a mouse or human genome is normally transcribed, far more than would be expected based on the number of “meaningful” DNA sequences thought to be present. In fact, most of the sequences included among this transcribed DNA are normally thought of as “junk,” including large numbers of transposable elements that constitute a sizeable fraction of mammalian genomes (page 411). Some studies raise the question whether there is any basic difference between a gene and an intergenic region; they report that transcription can begin at all kinds of unexpected sites within the genome, and that transcripts overlap extensively with one another. Other studies reveal that cells often transcribe both strands of a DNA element, producing both a sense RNA and an antisense RNA. Some antisense RNAs are synthesized by polymerases that had initially bound to the promoter of a protein-coding gene and then moved upstream from that site along the opposite strand of the chromosome, that is, in the opposite direction from polymerase molecules that are transcribing the gene itself. Just as the DNA of a cell or organism constitutes its genome and the proteins it produces constitute its proteome, the RNAs that are synthesized by a cell or organism constitute its transcriptome. Why is the transcriptome of a mammalian cell so large? In other words, why does a cell transcribe all of these various types of DNA sequences? We don’t know the answer to this basic question. According to one viewpoint, much of this pervasive transcription activity is simply the “background noise” that accompanies the complex process of gene expression. Proponents of this viewpoint cite the results of studies in mice in which blocks of the genome that lack protein-coding genes have been deleted. It was found in these studies that healthy mice can develop even though they cannot synthesize the noncoding RNAs (ncRNAs) that would normally have been produced from the deleted DNA sequences. According to a contrasting viewpoint, much of the noncoding RNA being produced is involved in diverse regulatory activities that have yet to be identified. Supporters of this position cite results suggesting that this pervasive transcription is not random; many of the noncoding transcripts display distinct and reproducible patterns of tissue-specific distribution, and their origin can be traced to specific loci in the genome. Keep in mind that it has only been in the last decade or so that the existence of siRNAs, miRNAs, and piRNAs has come to light, and it is likely that other types of small ncRNAs remain undetected. Not all noncoding transcripts are small. A number of evolutionarily conserved, long (>200 bases) ncRNAs (e.g., XIST, HOTAIR, and AIRE) that function as regulators of chromatin structure or gene transcription have also been discovered, as discussed in Sections 12.2, 12.3, and 17.4, respectively. Some studies suggest that there are thousands of conserved long ncRNAs (lncRNAs), and thus the few that have been identified are only the tip of the iceberg. It is even possible that the vast transcriptional output of the mammalian genome holds the key to explaining why we have approximately the same number of genes as organisms we believe to be much less complex (see Figure 10.27). Regardless of the actual explanation, one point is clear: there is a great deal that we do not understand about the many roles of RNAs in eukaryotic cells.

REVIEW 1. What is an siRNA, an miRNA and a piRNA? How is each of them formed in the cell? What are their proposed functions?

11.6 | Encoding Genetic Information

Once the structure of DNA had been revealed in 1953, it became evident that the sequence of amino acids in a polypeptide was specified by the sequence of nucleotides in the DNA of a gene. It seemed very unlikely that DNA could serve as a direct, physical template for the assembly of a protein. Instead, it was presumed that the information stored in the nucleotide sequence was present in some type of genetic code. With the discovery of messenger RNA as an intermediate in the flow of information from DNA to protein, attention turned to the manner in which a sequence written in a ribonucleotide “alphabet” might code for a sequence in an “alphabet” consisting of amino acids.

The Properties of the Genetic Code

One of the first models of the genetic code was presented by the physicist George Gamow, who proposed that each amino acid in a polypeptide was encoded by three sequential nucleotides. In other words, the code words, or codons, for amino acids were nucleotide triplets. Gamow arrived at this conclusion by a bit of armchair logic. He reasoned that it would require at least three nucleotides for each amino acid to have its own unique codon. Consider the number of words that can be spelled using an alphabet containing four different letters corresponding to the four possible bases that can be present at a particular site in DNA (or mRNA). There are 4 possible one-letter words, 16 (4²) possible two-letter words, and 64 (4³) possible three-letter words. Because there are 20 different amino acids (words) that have to be specified, codons must contain at least 3 successive nucleotides (letters). The triplet nature of the code was soon verified in a number of insightful genetic experiments conducted by Francis Crick, Sydney Brenner, and colleagues at Cambridge University.⁶ In addition to proposing that the code was triplet, Gamow also suggested that it was overlapping. Even though this suggestion proved to be incorrect, it raises an interesting question concerning the genetic code. Consider the following sequence of nucleotides:

-AGCAUCGCAUCGA-

If the code is overlapping, then the ribosome would move along the mRNA one nucleotide at a time, recognizing a new codon with each move. In the preceding sequence, AGC would specify an amino acid, GCA would specify the next amino acid, CAU the next, and so on. In contrast, if the code is nonoverlapping, each nucleotide along the mRNA would be part of one, and only one, codon. In the preceding sequence, AGC, AUC, and GCA would specify successive amino acids. A conclusion as to whether the genetic code was overlapping or nonoverlapping could be inferred from studies of mutant proteins, such as the mutant hemoglobin responsible for sickle cell anemia. In sickle cell anemia, as in most other cases that were studied, the mutant protein was found to contain a single amino acid substitution. If the code is overlapping, a change in one of the base pairs in the DNA would be expected to affect three consecutive codons (Figure 11.39) and, therefore, three consecutive amino acids in the corresponding polypeptide. If, however, the code is nonoverlapping and each nucleotide is part of only one codon, then only one amino acid replacement would be expected. These and other data indicated that the code is nonoverlapping.

⁶ Anyone looking to read a short research paper that conveys a feeling for the inductive power and elegance of the early groundbreaking studies in molecular genetics can find these experiments on the genetic code in Nature 192:1227, 1961. (See Cell 128:815, 2007 for discussion of this classic experiment.)

Given a triplet code that can specify 64 different amino acids and the reality that there are only 20 amino acids to be specified, the question arises as to the function of the extra 44 triplets. If any or all of these 44 triplets code for amino acids, then at least some of the amino acids would be specified by more than one codon. A code of this type is said to be degenerate. As it turns out, the code is highly degenerate, as nearly all of the 64 possible codons specify amino acids. Those that do not (3 of the 64) have a special “punctuation” function: they are recognized by the ribosome as termination codons and cause reading of the message to stop. The degeneracy of the code was originally predicted by Francis Crick on theoretical grounds when he considered the great range in base composition among the DNAs of various bacteria. It had been found, for example, that the G + C content could range from 20 percent to 74 percent of the genome, whereas the amino acid composition of the proteins from these organisms showed little overall variation. This suggested that the same amino acids were being encoded by different base sequences, which would make the code degenerate.

Identifying the Codons

By 1961, the general properties of the code were known, but not one of the coding assignments of the specific triplets had been discovered. At the time, most geneticists thought it would take many years to decipher the entire code. But a breakthrough was made by Marshall Nirenberg and Heinrich Matthaei, who used an enzyme, polynucleotide phosphorylase, to synthesize their own artificial genetic messages and then determine what kind of protein they encoded.
The first message they tested was a polyribonucleotide consisting exclusively of uridine; the message was called poly(U). When poly(U) was added to a test tube containing a bacterial extract with all 20 amino acids and the materials necessary for protein synthesis (ribosomes and various soluble factors), the system followed the artificial messenger’s instructions and manufactured a polypeptide. The assembled polypeptide was analyzed and found to be polyphenylalanine—a polymer of the amino acid phenylalanine. Nirenberg

                                          Base sequence     Codons
Original sequence                         ...AGCATCG...
  Overlapping code                                          ..., AGC GCA CAT ATC TCG ...
  Nonoverlapping code                                       ..., AGC ATC ...
Sequence after single-base substitution   ...AGAATCG...
  Overlapping code                                          ..., AGA GAA AAT ATC TCG ...
  Nonoverlapping code                                       ..., AGA ATC ...

Figure 11.39 The distinction between an overlapping and a nonoverlapping genetic code. The effect on the information content of an mRNA by a single base substitution depending on whether the code is overlapping or nonoverlapping. The affected codons are underlined in red.
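Both Gamow’s counting argument and the comparison in Figure 11.39 are simple enough to check by computation. The sketch below is only an illustration: the function names are hypothetical, and the two sequences are the ones shown in the figure. It finds the shortest codon length that can cover 20 amino acids with a four-letter alphabet, and then counts how many codons a single base substitution alters under each reading scheme.

```python
def min_codon_length(n_bases: int = 4, n_amino_acids: int = 20) -> int:
    """Smallest word length whose number of possible words covers every amino acid."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length

def codons(seq: str, overlapping: bool) -> list[str]:
    """Split a sequence into complete triplets, stepping 1 (overlapping) or 3 bases."""
    step = 1 if overlapping else 3
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]

def changed_codons(original: str, mutant: str, overlapping: bool) -> int:
    """Count codons that differ between two equal-length sequences."""
    pairs = zip(codons(original, overlapping), codons(mutant, overlapping))
    return sum(a != b for a, b in pairs)

original, mutant = "AGCATCG", "AGAATCG"    # sequences from Figure 11.39

print(min_codon_length())                   # 3, since 4**2 = 16 < 20 <= 4**3 = 64
print(codons(original, overlapping=True))   # ['AGC', 'GCA', 'CAT', 'ATC', 'TCG']
print(codons(original, overlapping=False))  # ['AGC', 'ATC']
print(changed_codons(original, mutant, overlapping=True))   # 3
print(changed_codons(original, mutant, overlapping=False))  # 1
```

The last two lines reproduce the logic of the mutant-protein argument: an overlapping code predicts three altered amino acids per base substitution, whereas a nonoverlapping code predicts one, which is what was observed.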

and Matthaei had thus shown that the codon UUU specifies phenylalanine. Over the next four years, the pursuit was joined by a number of laboratories, and synthetic mRNAs were constructed to test the amino acid specifications for all 64 possible codons. The result was the universal decoder chart for the genetic code shown in Figure 11.40. The decoder chart lists the nucleotide sequence for each of the 64 possible codons in an mRNA. Instructions for reading the chart are provided in the accompanying figure legend.⁷

⁷ As described on page 468, ribosomes always identify an AUG as the codon to start polypeptide synthesis. In these early experiments with synthetic polynucleotides, the steps using bacterial extracts to synthesize polypeptides were carried out at unusually high Mg²⁺ concentrations, which allowed the ribosome to initiate translation at any codon.

The first exceptions to the assignments of the codons shown in Figure 11.40 were found to occur in mitochondrial mRNAs. For example, in human mitochondria, UGA is read as tryptophan rather than stop, AUA is read as methionine rather than isoleucine, and AGA and AGG are read as stop rather than arginine. More recently, exceptions have been found, here and there, in the nuclear DNA codons of protists and fungi. It is evident, however, that these minor deviations have evolved as secondary changes from the standard genetic code and that the genetic code shown in Figure 11.40 can be considered to be universal, that is, present in all organisms. We can further conclude that all known organisms present on Earth today share a common evolutionary origin. Examination of the codon chart in Figure 11.40 indicates that the amino acid assignments are distinctly nonrandom. If one looks for the codon boxes for a specific amino acid, they

1st letter   2nd letter: U         C            A                G             3rd letter
U            Phenylalanine         Serine       Tyrosine         Cysteine      U
U            Phenylalanine         Serine       Tyrosine         Cysteine      C
U            Leucine               Serine       stop             stop          A
U            Leucine               Serine       stop             Tryptophan    G
C            Leucine               Proline      Histidine        Arginine      U
C            Leucine               Proline      Histidine        Arginine      C
C            Leucine               Proline      Glutamine        Arginine      A
C            Leucine               Proline      Glutamine        Arginine      G
A            Isoleucine            Threonine    Asparagine       Serine        U
A            Isoleucine            Threonine    Asparagine       Serine        C
A            Isoleucine            Threonine    Lysine           Arginine      A
A            (start) Methionine    Threonine    Lysine           Arginine      G
G            Valine                Alanine      Aspartic acid    Glycine       U
G            Valine                Alanine      Aspartic acid    Glycine       C
G            Valine                Alanine      Glutamic acid    Glycine       A
G            Valine                Alanine      Glutamic acid    Glycine       G

Examples of tRNAs (anticodon as drawn in the figure, with the codon it reads): cys, anticodon ACG, codon UGC; his, anticodon GUG, codon CAC; gly, anticodon CCU, codon GGA.

Figure 11.40 The genetic code. This universal decoder chart lists each of the 64 possible mRNA codons and the corresponding amino acid specified by that codon. To use the chart to translate the codon UGC, for example, find the first letter (U) in the indicated row on the left. Follow that row to the right until you reach the second letter (G) indicated at the top; then find the amino acid that matches the third letter (C) in the row on the right. UGC specifies the insertion of cysteine. Each amino acid (except two) has two or more codons that order its insertion, which makes the code degenerate. A given amino acid tends to be encoded by related codons. This feature reduces the likelihood that base substitutions will result in changes in the amino acid sequence of a protein. Amino acids with similar properties also tend to be clustered. Amino acids with acidic side chains are shown in red, those with basic side chains in blue, those with polar uncharged side chains in green, and those with hydrophobic side chains in brown. As discussed in the following section, decoding in the cell is carried out by tRNAs, a few of which are illustrated schematically in the right side of the figure. UAA, UAG, and UGA are read as stop codons.
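The chart-reading procedure described in the legend of Figure 11.40 amounts to a table lookup. The following is a minimal sketch, not a definitive implementation: the names `CODON_TABLE`, `decode`, and `codons_for` are hypothetical, amino acids are written with their standard one-letter symbols, and `*` is used here to mark a stop codon.

```python
from itertools import product

BASES = "UCAG"  # row and column order used in the chart
# One-letter amino acid symbols in chart order (first letter varies slowest,
# third letter fastest); "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def decode(codon: str) -> str:
    """Look up the amino acid (or stop) specified by one mRNA codon."""
    return CODON_TABLE[codon.upper()]


def codons_for(amino_acid: str) -> list[str]:
    """List every codon that specifies the given amino acid symbol."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(decode("UGC"))    # cysteine (C), the example used in the legend
print(codons_for("*"))  # the three stop codons
print(codons_for("L"))  # the six leucine codons
```

Counting the entries per amino acid in this table reproduces the degeneracy noted in the legend: only methionine and tryptophan are specified by a single codon.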

Base sequence of RNA
Original sequence: . . . AGC AUC UGU . . .

Sequence after single-base substitution:
(a) . . . AGC AUA UGU . . . Synonymous mutation
(b) . . . AGC AUG UGU . . . Nonsynonymous mutation
(c) . . . AGC AUC UGA . . . Nonsense mutation

Sequences after insertion or deletion of one or two bases (frameshift mutations):
(d) . . . AGC CAU CUG U . . .
    . . . AGU GCA UCU GU . . .
    . . . AGC UCU GU . . .
    . . . AGC CUG U . . .

Figure 11.41 Genes can be subject to several types of mutations. Red lines denote codons. (a) Synonymous mutations (in this case, AUC to AUA) do not affect an amino acid coding assignment, but they can still have a phenotypic effect by altering pre-mRNA splicing events, translation efficiency, mRNA stability, and so on. (b) Nonsynonymous (or missense) mutations (in this case, AUC to AUG) cause a change of a single amino acid in the polypeptide sequence. (c) Nonsense mutations (or premature termination codons) (in this case UGU to UGA) cause the ribosome to stop translating the mRNA at the site of mutation, thereby terminating protein synthesis prematurely. (d) Frameshift mutations, which can be caused by the insertion or deletion of one or two bases, move the ribosome into an incorrect reading frame, causing an abnormal amino acid sequence from the point of mutation onward.

tend to be clustered within a particular portion of the chart. Clustering reflects the similarity in codons that specify the same amino acid. As a result of this similarity in codon sequence, spontaneous mutations causing single base changes in a gene often will not produce a change in the amino acid sequence of the corresponding protein. A change in nucleotide sequence that does not affect amino acid sequence is said to be synonymous, whereas a change that causes an amino acid substitution is said to be nonsynonymous. These two types of mutations are depicted in Figure 11.41, which describes several of the different types of mutations discussed in this chapter. Synonymous changes (Figure 11.41a) are much less likely to change an organism’s phenotype than are nonsynonymous changes (Figure 11.41b). Consequently, nonsynonymous changes are much more likely to be selected for or against by natural selection. Now that we have sequenced the genomes of related organisms, such as those of chimpanzees and humans, we can look directly at the sequences of homologous genes and see how many changes are synonymous or nonsynonymous. Genes possessing an excess of nonsynonymous substitutions in their coding regions are likely to have been influenced by natural selection (see the footnote on page 413).

The “safeguard” aspect of the code goes beyond its degeneracy. Codon assignments are such that similar amino acids tend to be specified by similar codons. For example, codons of the various hydrophobic amino acids (depicted as brown boxes in Figure 11.40) are clustered in the first two columns of the chart. Consequently, a mutation that results in a base substitution in one of these codons is likely to substitute one hydrophobic residue for another. In addition, the greatest similarities between amino acid–related codons occur in the first two nucleotides of the triplet, whereas the greatest variability occurs in the third nucleotide. For example, glycine is encoded by four codons, all of which begin with the nucleotides GG. An explanation for this phenomenon is revealed in the following section in which the role of the transfer RNAs is described.
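The mutation classes of Figure 11.41 can be illustrated with a short sketch. The helper names `classify_substitution` and `translate` are hypothetical, the codon table is rebuilt here so the block stands alone, and `*` again denotes a stop codon; this is an illustration under those assumptions rather than a standard tool.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in chart order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(original: str, mutated: str) -> str:
    """Classify a change within one codon as synonymous, nonsynonymous, or nonsense."""
    before, after = CODON_TABLE[original], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"
    if after == "*":
        return "nonsense"
    return "nonsynonymous"


def translate(rna: str) -> str:
    """Translate consecutive triplets from the first base; leftover bases are ignored."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


print(classify_substitution("AUC", "AUA"))  # panel a
print(classify_substitution("AUC", "AUG"))  # panel b
print(classify_substitution("UGU", "UGA"))  # panel c
print(translate("AGCAUCUGU"), translate("AGCCAUCUGU"))  # original vs. one-base insertion (panel d)
```

The last line shows why a frameshift is so disruptive: after the inserted base, every downstream triplet is read differently.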

REVIEW
1. Explain what is meant by stating that the genetic code is triplet and nonoverlapping. What did the finding that DNA base compositions varied greatly among different organisms suggest about the genetic code? How was the identity of the UUU codon established?
2. Distinguish between the effects of a base substitution in the DNA on a nonoverlapping and an overlapping code.
3. Distinguish between a synonymous and a nonsynonymous base change.

11.7 | Decoding the Codons: The Role of Transfer RNAs

Nucleic acids and proteins are like two languages written with different types of letters. This is the reason protein synthesis is referred to as translation. Translation requires that information encoded in the nucleotide sequence of an mRNA be decoded and used to direct the sequential assembly of amino acids into a polypeptide chain. Decoding the information in an mRNA is accomplished by transfer RNAs, which act as adaptors. On one hand, each tRNA is linked to a specific amino acid (as an aa-tRNA), while on the other hand, that same tRNA is able to recognize a particular codon in the mRNA. The interaction between successive codons in the mRNA and specific aa-tRNAs leads to the synthesis of a polypeptide with an ordered sequence of amino acids. To understand how this occurs, we must first consider the structure of tRNAs.

The Structure of tRNAs

In 1965, after seven years of work, Robert Holley of Cornell University reported the first base sequence of an RNA molecule, that of a yeast transfer RNA that carries the amino acid alanine (Figure 11.42a). This tRNA is composed of 77 nucleotides, 10 of which are modified from the standard 4 nucleotides of RNA (A, G, C, U), as indicated by the shading in the figure. Over the following years, other tRNA species were purified and sequenced, and a number of distinct similarities present in all the different tRNAs became evident (Figure 11.42b). All tRNAs were roughly the same length (between 73 and 93 nucleotides), and all had a significant percentage of unusual bases that were found to result from enzymatic modifications of one of the four standard bases after it had been incorporated into the RNA chain, that is, posttranscriptionally. In addition, all tRNAs had sequences of nucleotides in one part of the molecule that were complementary to sequences located in other parts of the molecule. Because of these complementary sequences, the various tRNAs become folded in a similar way to form a structure that can be drawn in two dimensions as a cloverleaf. The base-paired stems and unpaired loops of the tRNA cloverleaf are shown in Figures 11.42 and 11.43. The unusual bases, which are concentrated in the loops, disrupt hydrogen bond formation in these regions and serve as potential recognition sites for various proteins. All mature tRNAs have the triplet sequence CCA at their 3′ end. These three nucleotides may be encoded in the tRNA gene (in many prokaryotes) or added enzymatically (in eukaryotes). In the latter case, a single talented enzyme adds all three nucleotides in the proper order without the benefit of a DNA or RNA template.

Up to this point we have considered only the secondary, or two-dimensional, structure of these adaptor molecules. Transfer RNAs fold into a unique and defined tertiary structure. X-ray diffraction analysis has shown that tRNAs are constructed of two double helices arranged in the shape of an L (Figure 11.43b). Those bases that are found at comparable sites in all tRNA molecules (the invariant bases of Figure 11.42b) are particularly important in generating the L-shaped tertiary structure. The common shape of tRNAs reflects the fact that all of them take part in a similar series of reactions during protein synthesis. However, each tRNA has unique features that distinguish it from other tRNAs. As discussed in the following section, it is these unique features that make it possible for an amino acid to become attached enzymatically to the appropriate (cognate) tRNA.

Figure 11.42 Two-dimensional structure of transfer RNAs. (a) The nucleotide sequence of the cloverleaf form of a yeast tRNAAla. The amino acid becomes linked to the 3′ end of the tRNA, whereas the opposite end bears the anticodon, in this case IGC. The function of the anticodon is discussed later. In addition to the four bases, A, U, G, and C, this tRNA contains ψ, pseudouridine; T, ribothymidine; mI, methylinosine; I, inosine; me2G, dimethylguanosine; D, dihydrouridine; and meG, methylguanosine. The sites of the ten modified bases in this tRNA are indicated by the color shading. (b) Generalized representation of tRNA in the cloverleaf form. The bases common to all tRNAs (both prokaryotic and eukaryotic) are indicated by letters; R is an invariant purine, Y is an invariant pyrimidine, ψ is an invariant pseudouridine. The greatest variability among tRNAs occurs in the V (variable) arm, which can range from 4 to 21 nucleotides. There are two sites of minor variability in the length of the D arm.
[Figure 11.42 labels: (a) site for amino acid attachment, anticodon; (b) amino acid acceptor stem, D arm, anticodon arm, variable arm, T arm, anticodon.]

Figure 11.43 The structure of a tRNA. (a) Two-dimensional structure of a yeast phenylalanyl-tRNA with the various regions of the molecule color-coded to match part b. (b) Three-dimensional structure of tRNAPhe derived from X-ray crystallography. The amino acceptor (AA) arm and the TψC (T) arm form a continuous double helix, and the anticodon (AC) arm and the D arm form a partially continuous double helix. These two helical columns meet to form an L-shaped molecule. (B: COURTESY OF MICHAEL CARSON, UNIVERSITY OF ALABAMA AT BIRMINGHAM.)
[Figure 11.43a labels: amino acid acceptor arm, T arm, D arm, anticodon.]

Transfer RNAs translate a sequence of mRNA codons into a sequence of amino acid residues. Information contained in the mRNA is decoded through the formation of base pairs between complementary sequences in the transfer and messenger RNAs (see Figure 11.49). Thus, as with other processes involving nucleic acids, complementarity between base pairs lies at the heart of the translation process. The part of the tRNA that participates in this complementary interaction with the codon of the mRNA is a stretch of three sequential nucleotides, called the anticodon, that is located in the middle loop of the tRNA molecule (Figure 11.43a). This loop is invariably composed of seven nucleotides, the middle three of which constitute the anticodon. The anticodon is located at one end of the L-shaped tRNA molecule opposite from the end at which the amino acid is attached (Figure 11.43b). Given the fact that 61 different codons can specify an amino acid, a cell might be expected to have at least 61 different tRNAs, each with a different anticodon complementary to one of the codons of Figure 11.40. But recall that the greatest similarities among codons that specify the same amino acid occur in the first two nucleotides of the triplet, whereas the greatest variability in these same codons occurs in the third nucleotide of the triplet. Consider the 16 codons ending in U. In every case, if the U is changed to a C, the same amino acid is specified (first two lines of each box in Figure 11.40).
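The claim about codons ending in U can be checked directly against the codon chart. The sketch below is illustrative only: `third_base_changes` is a hypothetical helper, the table is rebuilt so the block stands alone, and `*` marks a stop codon.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in chart order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_base_changes(base_1: str, base_2: str) -> list[tuple[str, str]]:
    """Return codon pairs that differ only in the third base and are decoded differently."""
    changes = []
    for first, second in product(BASES, repeat=2):
        codon_1, codon_2 = first + second + base_1, first + second + base_2
        if CODON_TABLE[codon_1] != CODON_TABLE[codon_2]:
            changes.append((codon_1, codon_2))
    return changes


sense_codons = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]
print(len(sense_codons))             # codons that specify an amino acid
print(third_base_changes("U", "C"))  # empty: U and C are interchangeable at position 3
print(third_base_changes("A", "G"))  # the few pairs where A and G are not interchangeable
```

For the A and G comparison, the only exceptions in the standard chart are AUA/AUG (isoleucine versus methionine) and UGA/UGG (stop versus tryptophan).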

Similarly, in most cases, a switch between an A and a G at the third site is also without effect on amino acid determination. The interchangeability of the base of the third position led Francis Crick to propose that the same transfer RNA may be able to recognize more than one codon. His proposal, termed the wobble hypothesis, suggested that the steric requirement between the anticodon of the tRNA and the codon of the mRNA is very strict for the first two positions but is more flexible at the third position. As a result, two codons that specify the same amino acid and differ only at the third position should use the same tRNA in protein synthesis. Once again, a Crick hypothesis proved to be correct. The rules governing the wobble at the third position of the codon are as follows (Figure 11.44): U of the anticodon can pair with A or G of the mRNA; G of the anticodon can pair with U or C of the mRNA; and I (inosine, which is derived from adenosine in the original tRNA molecule) of the anticodon can pair with U, C, or A of the mRNA. As a result of the wobble, the six codons for leucine, for example, require only three tRNAs.

Amino Acid Activation

It is critically important during polypeptide synthesis that each transfer RNA molecule be attached to the correct (cognate) amino acid. Amino acids are covalently linked to the 3′ ends of their cognate tRNA(s) by an enzyme called an aminoacyl-tRNA synthetase (aaRS)

Figure 11.44 The wobble in the interaction between codons and anticodons. In some cases, the nucleotide at the 5′ end of the tRNA anticodon is capable of pairing with more than one nucleotide at the 3′ end (third position) of the mRNA codon. Consequently, more than one codon can use the same tRNA. The rules for pairing in the wobble scheme are indicated in the figure as well as in the text.
[Figure 11.44 diagram: anticodons of leucine (Leu) and isoleucine (Ile) tRNAs paired with mRNA codons that differ at the third position.]

(Figure 11.45). Although there are many exceptions, organisms typically contain 20 different aminoacyl-tRNA synthetases, one for each of the 20 amino acids incorporated into proteins. Each of the synthetases is capable of “charging” all of the tRNAs that are appropriate for that amino acid (i.e., any tRNA whose anticodon recognizes one of the various codons specifying that amino acid, as indicated in Figure 11.40). Aminoacyl-tRNA synthetases provide an excellent example of the specificity of protein–nucleic acid interactions. Certain common features must exist among all tRNA species specifying a given amino acid to allow a single aminoacyl-tRNA synthetase to recognize all of these tRNAs while, at the same time, discriminating against all of the tRNAs for other amino acids. Information concerning the structural features of tRNAs that cause them to be selected or rejected as substrates has come primarily from two sources:

1. Determination of the three-dimensional structure of these enzymes by X-ray crystallography, which allows investigators to identify which sites on the tRNA make direct contact with the protein. As illustrated in Figure 11.45, the two ends of the tRNA (the acceptor stem and anticodon) are particularly important for recognition by most of these enzymes.
2. Determination of the changes in a tRNA that cause the molecule to be aminoacylated by a noncognate synthetase. It is found, for example, that a specific base pair in a tRNAAla (the G-U base pair involving the third G from the 5′ end of the molecule in Figure 11.42a) is the primary determinant of its interaction with the alanyl-tRNA synthetase. Insertion of this specific base pair into the acceptor stem of a tRNAPhe or a tRNACys is sufficient to cause these tRNAs to be recognized by alanyl-tRNA synthetase and to be aminoacylated with alanine.

Aminoacyl-tRNA synthetases carry out the following two-step reaction:

1st step: ATP + amino acid → aminoacyl-AMP + PPi
2nd step: aminoacyl-AMP + tRNA → aminoacyl-tRNA + AMP

In the first step, the energy of ATP activates the amino acid by formation of an adenylated amino acid, which is bound to the enzyme. This is the primary energy-requiring step in the chemical reactions leading to polypeptide synthesis. Subsequent events, such as the transfer of the amino acid to the tRNA molecule (step 2 above), and eventually to the growing polypeptide chain, are thermodynamically favorable. The PPi produced in the first reaction is subsequently hydrolyzed to Pi, further driving the overall reaction toward the formation of products (page 431). As we will see later, energy is expended during protein synthesis, but it is not used in the formation of peptide bonds. In the second step, the enzyme transfers its bound amino acid to the 3′ end of a cognate tRNA. Among the 20 amino acids incorporated into proteins, some are quite similar structurally to others. The synthetases that catalyze reactions dealing with these “similar-looking” amino acids typically possess a second editing site in addition to the catalytic site. Should the synthetase happen to place an inappropriate amino acid on a tRNA in the catalytic site, a proofreading mechanism of the enzyme is activated and the bond between the amino acid and tRNA is severed in the editing site.

Several methods have been developed that allow investigators to synthesize proteins, either in the test tube or within cells, that contain unnatural amino acids, that is, amino acids other than the 20 normally incorporated by the translation machinery. These unnatural amino acids are not generated by modification of normal amino acids after their incorporation into the polypeptide but are directly encoded within the mRNA. This “expansion” of the genetic code generally involves the use of an experimentally modified tRNA that recognizes one of the stop codons and a cognate aminoacyl-tRNA synthetase that specifically recognizes the unnatural amino acid. The unnatural amino acid is then incorporated into the polypeptide chain wherever a particular stop codon is encountered within an mRNA. Using these methods, researchers can synthesize a protein containing chemical groups capable of reporting on the protein’s activities, or they can design proteins with novel structures and functions for use as potential drugs or for other commercial applications.

Figure 11.45 Three-dimensional portrait of the interaction between a tRNA and its aminoacyl-tRNA synthetase. The crystal structure of E. coli glutaminyl-tRNA synthetase (in blue) complexed with tRNAGln (in red and yellow). The enzyme recognizes this specific tRNA and discriminates against others through interactions with the acceptor stem and anticodon of the tRNA. (FROM THOMAS A. STEITZ, SCIENCE VOL. 246, COVER OF 12/1/89 ISSUE; © 1989, REPRINTED WITH PERMISSION FROM AAAS.)

REVIEW
1. Why are tRNAs referred to as adaptor molecules? What aspects of their structure do tRNAs have in common?
2. Describe the nature of the interaction between tRNAs and aminoacyl-tRNA synthetases. Describe the nature of the interaction between tRNAs and mRNAs. What is the wobble hypothesis?
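As an aid to the last review question, the wobble rules given in this section (Figure 11.44) can be written as a small lookup table. This is a sketch under stated assumptions: the names `WOBBLE`, `WATSON_CRICK`, and `codons_read_by` are hypothetical, the entries for C and A at the 5′ end of the anticodon assume ordinary Watson-Crick pairing (the text lists rules only for U, G, and I), and the three leucine anticodons are one illustrative set rather than the tRNAs of any particular organism.

```python
# Codon third-position bases that each anticodon 5' (wobble) base can pair with.
WOBBLE = {
    "U": "AG",   # rule stated in the text
    "G": "UC",   # rule stated in the text
    "I": "UCA",  # inosine, rule stated in the text
    "C": "G",    # assumption: standard Watson-Crick pairing only
    "A": "U",    # assumption: standard Watson-Crick pairing only
}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read_by(anticodon: str) -> list[str]:
    """Return the codons (5'->3') recognized by an anticodon written 5'->3'."""
    wobble_base, middle, last = anticodon
    first = WATSON_CRICK[last]      # anticodon 3' base pairs with codon position 1
    second = WATSON_CRICK[middle]   # middle bases pair with each other
    return [first + second + third for third in WOBBLE[wobble_base]]


print(codons_read_by("IGC"))  # the anticodon of the yeast alanine tRNA in Figure 11.42a

# One illustrative set of three anticodons that covers all six leucine codons.
leucine_codons = set()
for anticodon in ("UAA", "IAG", "CAG"):
    leucine_codons.update(codons_read_by(anticodon))
print(sorted(leucine_codons))
```

Because pairing is antiparallel, the first base of the anticodon as written (its 5′ end) is the one that meets the third base of the codon, which is why only that position appears in the wobble table.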


11.8 | Translating Genetic Information

Protein synthesis, or translation, may be the most complex synthetic activity in a cell. The assembly of a protein requires all the various tRNAs with their attached amino acids, ribosomes, a messenger RNA, numerous proteins having different functions, cations, and GTP. The complexity is not surprising considering that protein synthesis requires the incorporation of 20 different amino acids in the precise sequence dictated by a coded message written in a language that uses different characters. In the following discussion, we will draw most heavily on translation mechanisms as they operate in bacterial cells, where the process is simpler and better understood. The process is remarkably similar in eukaryotic cells. The synthesis of a polypeptide chain can be divided into three rather distinct activities: initiation of the chain, elongation of the chain, and termination of the chain. We will consider each of these activities in turn.

Initiation

Once it attaches to an mRNA, a ribosome always moves along the mRNA from one codon to the next, that is, in consecutive blocks of three nucleotides. To ensure that the proper triplets are read, the ribosome attaches to the mRNA at a precise site, termed the initiation codon, which is specified as AUG. Binding to this codon automatically puts the ribosome in the proper reading frame so that it correctly reads the entire message from that point on. For example, in the following case,

-CUAGUUACAUGCUCCAGUCCGU-

the ribosome moves from the initiation codon, AUG, to the next three nucleotides, CUC, then to CAG, and so on along the entire line. The basic steps in the initiation of translation in bacterial cells are illustrated in Figure 11.46.

Step 1: Bringing the Small Ribosomal Subunit to the Initiation Codon

As seen in Figure 11.46, an mRNA does not bind to an intact ribosome, but to the small and large subunits in separate stages. The first major step of initiation is the binding of the small ribosomal subunit to the first AUG sequence (or one of the first) in the message, which serves as the initiation codon.8 How does the small subunit select the initial AUG codon as opposed to some internal one? Bacterial mRNAs possess a specific sequence of nucleotides (called the Shine-Dalgarno sequence after its discoverers) that resides 5 to 10 nucleotides before the initiation codon. The Shine-Dalgarno sequence is complementary to a sequence of nucleotides near the 3′ end of the 16S ribosomal RNA of the small ribosomal subunit.

Interaction between these complementary sequences on the mRNA and rRNA positions the 30S subunit at the AUG initiation codon.

8 GUG is also capable of serving as an initiation codon and is found as such in a few natural messages. When GUG is used, N-formylmethionine is still utilized to form the initiation complex despite the fact that internal GUG codons specify valine.

Initiation Requires Initiation Factors

Several of the steps outlined in Figure 11.46 require the help of soluble proteins, called initiation factors (designated as IFs in bacteria and eIFs in eukaryotes). Bacterial cells require three initiation factors (IF1, IF2, and IF3), which attach to the 30S subunit (step 1, Figure 11.46). IF2 is a GTP-binding protein required for attachment of the first aminoacyl-tRNA. IF3 may prevent the large (50S) subunit from joining prematurely to the 30S subunit, and also facilitate entry of the appropriate initiator aa-tRNA. IF1 is thought to promote attachment of the 30S subunit to the mRNA and may prevent the aa-tRNA from entering the wrong site on the ribosome.

Step 2: Bringing the First aa-tRNA into the Ribosome

If one examines the codon assignments (Figure 11.40), it can be seen that AUG is more than just an initiation codon; it is also the only codon for methionine. In fact, methionine is always the first amino acid to be incorporated at the N-terminus of a nascent polypeptide chain. (In bacteria, the initial methionine bears a formyl group, making it N-formylmethionine.) The methionine (or N-formylmethionine) is subsequently removed enzymatically from the majority of newly synthesized proteins. Cells possess two distinct methionyl-tRNAs: one used to initiate protein synthesis and a different one to incorporate methionyl residues into the interior of a polypeptide. The initiator aa-tRNA is positioned by the IF2 initiation factor within the P site of the ribosome (step 2, Figure 11.46), where the anticodon loop of the tRNA binds to the AUG codon of the mRNA. IF1 and IF3 are released.
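The reading-frame example given at the start of this discussion of initiation can be sketched in code. The function `read_codons` is hypothetical, and it makes a simplifying assumption: it starts at the first AUG it finds, whereas a bacterial ribosome is positioned by the Shine-Dalgarno interaction described above.

```python
def read_codons(mrna: str, start: str = "AUG") -> list[str]:
    """Split an mRNA into consecutive triplets, beginning at the first start codon.

    Assumption: the first occurrence of `start` is the initiation codon.
    Bases left over after the last complete triplet are ignored.
    """
    begin = mrna.find(start)
    if begin == -1:
        return []  # no initiation codon, so no reading frame is set
    return [mrna[i:i + 3] for i in range(begin, len(mrna) - 2, 3)]


# The example sequence from the text:
print(read_codons("CUAGUUACAUGCUCCAGUCCGU"))
```

Starting one base earlier or later would group the same nucleotides into entirely different triplets, which is why binding at the initiation codon fixes the reading frame for the whole message.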


Step 3: Assembling the Complete Initiation Complex

Once the initiator tRNA is bound to the AUG codon and IF3 is displaced, the large subunit joins the complex and the GTP bound to IF2 is hydrolyzed (step 3, Figure 11.46). GTP hydrolysis probably drives a conformational change in the ribosome that is required for the release of IF2-GDP.


Initiation of Translation in Eukaryotes

Eukaryotic ribosomes contain larger rRNAs and additional proteins, and eukaryotic mRNAs possess specialized features, such as 5′ caps and 3′ tails. Not surprisingly, initiation of translation in eukaryotes is quite different and more complex than the corresponding process in bacteria and, moreover, it is an important step at which control over gene expression is exerted (see Section 12.6). Eukaryotic cells require at least 12 initiation factors comprising a total of more than 25 polypeptide chains. As indicated in Figure 11.47, several of these eIFs (e.g., eIF1, eIF1A, eIF5, and eIF3) bind to the 40S subunit, which prepares the subunit for binding to the mRNA. The initiator tRNA linked to a methionine also binds to the 40S subunit prior to its interaction with the mRNA. The initiator tRNA enters the P site of the subunit in association with eIF2-GTP. Once these events have occurred, the small ribosomal subunit with its associated initiation factors and charged tRNA (which together form a 43S preinitiation complex) is ready to find the 5′ end of the mRNA, which bears the methylguanosine cap (page 448). The 43S complex is originally recruited to the mRNA with the help of a cluster of initiation factors that are already bound to the mRNA (Figure 11.47). Among these factors: (1)


Figure 11.47 Initiation of protein synthesis in eukaryotes. As discussed in the text, initiation begins with the union of two complexes. One (called the 43S complex) contains the 40S ribosomal subunit bound to several initiation factors (eIFs) and the initiator tRNA, whereas the other contains the mRNA bound to a separate group of initiation factors. This union is mediated by an interaction between eIF3 on the 43S complex and eIF4G on the mRNA complex. eIF1 and eIF1A are thought to induce a conformational change in the small ribosomal subunit that opens a “latch” to accommodate the entry of the mRNA. Once the 43S complex has bound to the 5′ end of the mRNA, it scans along the message until it reaches the appropriate AUG initiation codon.


Figure 11.46 Initiation of protein synthesis in bacteria. In step 1, initiation of translation begins with the association of the 30S ribosomal subunit with the mRNA at the AUG initiation codon, a step that requires IF1 and IF3. The 30S ribosomal subunit binds to the mRNA at the AUG initiation codon as the result of an interaction between a complementary nucleotide sequence on the rRNA and mRNA, as discussed in the text. In step 2, the formylmethionyl-tRNAfMet becomes associated with the mRNA and the 30S ribosomal subunit complex by binding to IF2-GTP. In step 3, the 50S subunit joins the complex, GTP is hydrolyzed, and IF2-GDP is released. The initiator tRNA enters the P site of the ribosome, whereas all subsequent tRNAs enter the A site (see Figure 11.49).


eIF4E binds to the 5⬘ cap of the eukaryotic mRNA; (2) eIF4A moves along the 5⬘ end of the message removing any doublestranded regions that would interfere with the movement of the 43S complex along the mRNA; and (3) eIF4G serves as a linker between the 5⬘ capped end and the 3⬘ polyadenylated end of the mRNA (Figure 11.47). Thus EIF4G, in effect, converts a linear mRNA into a circular message. Once the 43S complex binds to the 5⬘ end of the mRNA, the complex then scans along the message until it reaches a recognizable sequence of nucleotides (typically 5⬘—CCACCAUGC—3⬘) that contains the AUG initiation codon. Once the 43S complex reaches the appropriate AUG codon, the GTP bound to eIF2 is hydrolyzed, and the large (60S) subunit joins the complex. The formation of the complete 80S ribosome requires the activity of another GTP-binding protein, called eIF5B-GTP, whose bound GTP is also hydrolyzed. Hydrolysis of the two bound GTPs and formation of the complete ribosome are accompanied by the release of all of the initiation factors. These events leave the anticodon of the initiator tRNA bound to the AUG start codon in the P site of the assembled ribosome. The complex is now ready for the first step in elongation.9,10 The Role of the Ribosome Now that we have reached a point where we have assembled a complete ribosome, we can look more closely at the structure and function of this multisubunit structure. Ribosomes are molecular machines, similar in some respects to the molecular motors described in Chapter 9. During translation, a ribosome undergoes a repetitive cycle of mechanical changes that is driven with energy released by GTP hydrolysis. Unlike myosin or kinesin, which simply move along a physical track, ribosomes move along an mRNA “tape” containing encoded information. Ribosomes, in other words, are programmable machines: the information stored in the mRNA determines the sequence of aminoacyltRNAs that the ribosome accepts during translation. 
Another feature that distinguishes ribosomes from many other cellular machines is the importance of their component RNAs. Ribosomal RNAs play major roles in selecting tRNAs, ensuring accurate translation, binding protein factors, and polymerizing amino acids (discussed in the Experimental Pathways at the end of the chapter).

The past few years have seen great progress in our understanding of the structure of the ribosome. Earlier studies using high-resolution cryoelectron microscopic imaging techniques (Section 18.8) revealed the ribosome to be a highly irregular structure with bulges, lobes, channels, and bridges (see Figure 2.56). These studies, carried out primarily by Joachim Frank and colleagues at Columbia University, also provided descriptions of major conformational changes occurring within the ribosome during translation. During the 1990s, major advances were made in the crystallization of ribosomes, and by the end of the decade the first reports of the X-ray crystallographic structure of prokaryotic ribosomes had appeared. Figures 11.48a and b show the overall structure of the two ribosomal subunits of a bacterial ribosome as revealed by X-ray crystallography. More recently, X-ray crystallography has been successfully applied to the study of the much larger eukaryotic ribosome, which shares the same core structure and primary functions with its prokaryotic counterpart.

Each ribosome has three sites for association with transfer RNA molecules. These sites, termed the A (aminoacyl) site, the P (peptidyl) site, and the E (exit) site, receive each tRNA in successive steps of the elongation cycle, as described in the following section. The positions of tRNAs bound to the A, P, and E sites of both the small and large ribosomal subunit are shown in Figure 11.48a,b. The tRNAs bind within these sites and span the gap between the two ribosomal subunits (Figure 11.48c). The anticodon ends of the bound tRNAs contact the small subunit, which plays a key role in decoding the information contained in the mRNA. In contrast, the amino acid–carrying ends of the bound tRNAs contact the large subunit, which plays a key role in catalyzing peptide bond formation. Other major features revealed by these high-resolution structural studies include the following:

1. The interface between the small and large subunits forms a relatively spacious cavity (Figure 11.48c) that is lined almost exclusively by RNA. The side of the small subunit that faces this cavity is lined along its length by a single continuous double-stranded RNA helix. This helix is shaded in the two-dimensional structure of the 16S rRNA in Figure 11.3. The surfaces of the two subunits that face one another contain the binding sites for the mRNA and incoming tRNAs and, thus, are of key importance for the function of the ribosome. The fact that these surfaces consist largely of RNA supports the proposal that primordial ribosomes were composed exclusively of RNA (page 454).
2. The active site, where amino acids are covalently linked to one another, also consists of RNA. This catalytic portion of the large subunit resides in a deep cleft, which protects the newly formed peptide bond from hydrolysis by the aqueous solvent.
3. The mRNA is situated in a narrow groove that winds around the neck of the small subunit, passing through the A, P, and E sites. Prior to entering the A site, the mRNA is stripped of any secondary structure it might possess by an apparent helicase activity of the ribosome.
4. A tunnel runs completely through the core of the large subunit beginning at the active site. This tunnel provides a passageway for translocation of the elongating polypeptide through the ribosome (Figure 11.48c).
5. Most of the proteins of the ribosomal subunits have multiple RNA-binding sites and are ideally situated to stabilize the complex tertiary structure of the rRNAs.

Footnote 9: Not all mRNAs are translated following attachment of the small ribosomal subunit to the 5′ end of the message. Many viral mRNAs and a relatively small number of cellular mRNAs, most notably those utilized during mitosis or during periods of stress, are translated as the result of the ribosome attaching to the mRNA at an internal ribosome entry site (IRES), which may be located at some distance from the 5′ end of the message.

Footnote 10: The description presented here pertains to steady-state translation in which an mRNA is translated repeatedly by numerous ribosomes. The first (or pioneer) round of translation has several unique properties. The cap at the 5′ end of the mRNA, for example, is bound by a heterodimeric cap-binding complex (CBC) rather than by the protein eIF4E. The differences between the pioneer and steady-state rounds of translation are discussed in Cell 142:368, 2010.

Elongation
The basic steps in the elongation phase of translation in bacterial cells are illustrated in Figure 11.49. This series of steps is repeated over and over as amino acids are polymerized one after another into the growing polypeptide chain.

[Figure 11.48 labels: 30S, 50S, 70S, decoding site, mRNA, factor binding site, peptidyl transfer center, binding sites for tRNA (A, P, E), tRNA, amino-acid residues, exit channel.]

Step 1: Aminoacyl-tRNA Selection. With the charged initiator tRNA in place within the P site, the ribosome is available for entry of the second aminoacyl-tRNA into the vacant A site, which is the first step in elongation (step 1, Figure 11.49a). Before the second aminoacyl-tRNA can effectively bind to the exposed mRNA codon in the A site, it must combine with a protein elongation factor bound to GTP. This particular elongation factor is called EF-Tu (or Tu) in bacteria and eEF1A in eukaryotes. EF-Tu is required to deliver aminoacyl-tRNAs to the A site of the ribosome. Although any aminoacyl-tRNA–Tu-GTP complex can enter the site, only one whose anticodon is complementary to the mRNA codon situated in the A site will trigger the necessary conformational changes within the ribosome that cause the tRNA to remain bound to the mRNA in the decoding center. Once the proper aminoacyl-tRNA–Tu-GTP is bound to the mRNA codon, the GTP is hydrolyzed and the Tu-GDP complex is released, leaving the newly arrived aa-tRNA situated in the ribosome’s A site. Regeneration of Tu-GTP from the released Tu-GDP requires another elongation factor, EF-Ts.

Step 2: Peptide Bond Formation. At the end of the first step, the two amino acids, attached to their separate tRNAs, are juxtaposed to one another and precisely aligned to interact chemically (Figure 11.48a′,b′). The second step in the elongation cycle is the formation of a peptide bond between these two amino acids (step 2, Figure 11.49a). Peptide bond formation is accomplished as the amine nitrogen of the aa-tRNA in the A site carries out a nucleophilic attack on the carbonyl carbon of the amino acid bound to the tRNA of the P site, displacing the P-site tRNA (Figure 11.49b). As a result of this reaction, the tRNA bound to the second codon in the A site has an attached dipeptide, whereas the tRNA in the P site is deacylated. Peptide bond formation occurs spontaneously without the input of external energy. The reaction is catalyzed by peptidyl transferase, a component of the large subunit of the ribosome. For years it was assumed that the peptidyl transferase was one of the proteins of the ribosome. Then, as the catalytic powers of RNA became apparent, attention shifted to the ribosomal RNA as the catalyst for peptide bond formation. It has now been shown that peptidyl transferase activity does indeed reside in the large ribosomal RNA molecule of the large ribosomal subunit (see the chapter-opening image). In other words, peptidyl transferase is a ribozyme, a subject discussed in the Experimental Pathways at the end of the chapter.

11.8 Translating Genetic Information

Figure 11.48 Model of the bacterial ribosome based on X-ray crystallographic data, showing tRNAs bound to the A, P, and E sites of the two ribosomal subunits. (a–b) Views of the 30S and 50S subunits, respectively, with the three bound tRNAs shown at the interface between the subunits. (a′–b′) Drawings corresponding to the structures shown in parts a and b. The drawing in a′ of the 30S subunit shows the approximate locations of the anticodons of the three tRNAs and their interaction with the complementary codons of the mRNA. The drawing in b′ of the 50S subunit shows the tRNA sites from the reverse direction. The amino acid acceptor ends of the tRNAs of the A and P sites are in close proximity within the peptidyl transfer site of the subunit, where peptide bond formation occurs. The binding sites for the elongation factors EF-Tu and EF-G are located on the protuberance at the right side of the subunit. (c) Drawing of the 70S bacterial ribosome showing the space between the two subunits that is spanned by each tRNA molecule and the channel within the 50S subunit through which the nascent polypeptide exits the ribosome. (A–B: FROM JAMIE H. CATE ET AL., COURTESY OF HARRY F. NOLLER, SCIENCE 285:2100, 1999; A′–B′, C: FROM A. LILJAS, SCIENCE 285:2078, 1999; BOTH © 1999, REPRINTED WITH PERMISSION FROM AAAS.)

[Figure 11.49 labels, steps 1 and 2: aa-tRNA entry; peptide bond formation; EF-Tu, GTP, GDP; Phe-tRNA (phenylalanine); tRNA1, tRNA2; peptide; E, P, and A sites; mRNA 5′ end.]

Figure 11.49 Schematic representation of the steps in elongation during translation in bacteria. (a) In step 1, an aminoacyl-tRNA whose anticodon is complementary to the second codon of the mRNA enters the empty A site of the ribosome. The binding of the tRNA is accompanied by the release of EF-Tu-GDP. In step 2, peptide bond formation is accomplished by the transfer of the nascent polypeptide chain from the tRNA in the P site to the aminoacyl-tRNA of the A site, forming a dipeptidyl-tRNA in the A site and a deacylated tRNA in the P site. The reaction is catalyzed by a part of the rRNA acting as a ribozyme. In step 3, the binding of EF-G and the hydrolysis of its associated GTP result in the translocation of the ribosome relative to the mRNA. Translocation is accompanied by the movement of the deacylated tRNA and peptidyl-tRNA into the E and P sites, respectively. In step 4, the deacylated tRNA leaves the ribosome, and a new aminoacyl-tRNA enters the A site. (b) Peptide bond formation and the subsequent displacement of the deacylated tRNA. A ribosome can catalyze the incorporation of 10 to 20 amino acids into a growing polypeptide per second, which is roughly 10 million times greater than that observed in the uncatalyzed reaction using model substrates in solution.
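The four steps summarized in this caption can be traced with a toy model of the A, P, and E sites. The sketch below is illustrative only: the `Ribosome` class, its method names, and the representation of a tRNA as a (codon, residues) pair are hypothetical, and no attempt is made to model codon–anticodon matching or timing. It counts one GTP for EF-Tu and one for EF-G in each cycle.

```python
from dataclasses import dataclass


@dataclass
class Ribosome:
    """Toy bookkeeping of the three tRNA sites during elongation."""
    codons: list              # mRNA codons, beginning with the initiation codon
    p_site: object = None     # (codon, residues) for the peptidyl-tRNA
    a_site: object = None     # (codon, residues) for the incoming aminoacyl-tRNA
    e_site: object = None     # deacylated tRNA awaiting release
    position: int = 0         # index of the codon currently in the P site
    gtp_hydrolyzed: int = 0

    def select_aa_trna(self, residue):
        """Step 1: an aminoacyl-tRNA enters the A site; EF-Tu hydrolyzes GTP."""
        self.a_site = (self.codons[self.position + 1], [residue])
        self.gtp_hydrolyzed += 1

    def form_peptide_bond(self):
        """Step 2: the nascent chain is transferred from the P-site to the A-site tRNA."""
        p_codon, chain = self.p_site
        a_codon, (residue,) = self.a_site
        self.a_site = (a_codon, chain + [residue])
        self.p_site = (p_codon, [])  # the P-site tRNA is now deacylated

    def translocate(self):
        """Step 3: tRNAs shift A->P and P->E; EF-G hydrolyzes GTP."""
        self.e_site, self.p_site, self.a_site = self.p_site, self.a_site, None
        self.position += 1
        self.gtp_hydrolyzed += 1

    def release_trna(self):
        """Step 4: the deacylated tRNA leaves the E site."""
        self.e_site = None


def elongate(codons, residues):
    """Run one elongation cycle per residue after the initiator."""
    ribosome = Ribosome(codons=codons, p_site=(codons[0], [residues[0]]))
    for residue in residues[1:]:
        ribosome.select_aa_trna(residue)
        ribosome.form_peptide_bond()
        ribosome.translocate()
        ribosome.release_trna()
    return ribosome


ribosome = elongate(["AUG", "UUU", "AGC"], ["Met", "Phe", "Ser"])
print(ribosome.p_site)          # ('AGC', ['Met', 'Phe', 'Ser'])
print(ribosome.gtp_hydrolyzed)  # 4
```

Two cycles add two residues and consume four GTPs in this model; at the 10 to 20 residues per second quoted in the caption, each cycle would take roughly 0.05 to 0.1 second.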

[Figure 11.49a labels, step 3: translocation; EF-G, GTP, GDP; mRNA 5′ end.]

Chapter 11 Gene Expression: From Transcription to Translation

[Figure 11.49a labels, step 4: release of deacylated tRNA; EF-Tu, GTP, GDP; serine; E, P, and A sites; mRNA 5′ end.]

Step 3: Translocation. The formation of the first peptide bond leaves one end of the tRNA molecule of the A site still attached to its complementary codon on the mRNA and the other end of the molecule attached to a dipeptide (step 2, Figure 11.49a). The tRNA of the P site is now devoid of any linked amino acid. The following step, called translocation, is characterized by a small (6°) ratchet-like motion of the small subunit relative to the large subunit, which is depicted in Figure 11.50a. As a result of this ratcheting motion, the ribosome moves three nucleotides (one codon) along the mRNA in the 5′ → 3′ direction (step 3, Figure 11.49a). Translocation is accompanied by the movement of (1) the dipeptidyl-tRNA from the A site to the P site of the ribosome, and (2) the deacylated tRNA from the P site to the E site. As these movements occur, both of these tRNAs remain hydrogen-bonded to their codons in the mRNA. An intermediate stage in the translocation process has been visualized by cryoelectron microscopy, which shows the tRNAs occupying partially translocated “hybrid states.” In these hybrid states, the anticodon ends of the tRNAs still reside in the A and P sites of the small subunit, while the acceptor ends of the tRNAs have moved into the P and E sites of the large subunit. The tRNAs are said to occupy A/P and P/E hybrid sites, respectively.

The various steps that occur during the process of translocation are shown in Figure 11.50b. The shift from the “classic,” nonratcheted state in step 1, Figure 11.50b, to the hybrid, ratcheted state (step 2) occurs spontaneously, that is, without the involvement of other factors. It appears that the ribosome can spontaneously oscillate back and forth between these two states. Once it has shifted to the hybrid state, a GTP-bound elongation factor (EF-G in bacteria and eEF2 in eukaryotes) binds to the ribosome (step 3), stabilizing the ribosome in the ratcheted state and thereby preventing the movement of the tRNAs back to the classic A/A and P/P conformation. Then, hydrolysis of the bound GTP generates a conformational change that moves the mRNA and associated anticodon loops of the tRNAs relative to the small ribosomal subunit, which places the bound tRNAs in the E/E and P/P states, leaving the A site empty (step 4). At the same time, the ribosome is reset to the nonratcheted state. Following this reaction, EF-G–GDP dissociates from the ribosome (step 5).

[Figure 11.50 labels: classic and ratcheted states (~6°); hybrid states formation, ratcheting, EF-G binding, GTP hydrolysis, conformational change, translocation, EF-G dissociation; steps 1–5; E, P, and A sites.]

Figure 11.50 Structural model of the stages of translocation during translational elongation in bacteria. (a) Depiction of the shift from the classic to the ratcheted state of the prokaryotic ribosome, which involves a 6° rotation of the subunits relative to one another. (Detailed discussion of this rotation can be found in Ann. Rev. Biophys. 39:227, 2010.) (b) The steps that occur during translocation as described in the text. (FROM T. MARTIN SCHMEING AND V. RAMAKRISHNAN, NATURE 461:1238, 2009. REPRINTED WITH PERMISSION FROM MACMILLAN PUBLISHERS LTD. COURTESY VENKI RAMAKRISHNAN.)

Step 4: Releasing the Deacylated tRNA. In the final step of elongation (step 4, Figure 11.49a), the deacylated tRNA leaves the ribosome, emptying the E site.

For each cycle of elongation, at least two molecules of GTP are hydrolyzed: one during aminoacyl-tRNA selection and one during translocation. Each cycle of elongation takes about one-twentieth of a second, most of which is probably spent sampling aa-tRNAs from the surrounding cytosol. Once the peptidyl-tRNA has moved to the P site by translocation, the A site is once again open to the entry of another aminoacyl-tRNA, in this case one whose anticodon is complementary to the third codon (Figure 11.49a). Once the third charged tRNA is associated with the mRNA in the A site, the dipeptide from the tRNA of the P site is displaced by the aa-tRNA of the A site, forming the second peptide bond and, as a result, a tripeptide attached to the tRNA of the A site. The tRNA in the P site is once again devoid of an amino acid. Peptide bond formation is followed by translocation of the ribosome to the fourth codon and release of the deacylated tRNA, whereupon the cycle is ready to begin again.

We have seen in this section how the ribosome moves three nucleotides (one codon) at a time along the mRNA. The particular sequence of codons in the mRNA that is utilized by a ribosome (i.e., the reading frame) is fixed at the time the ribosome attaches to the initiation codon at the beginning of translation. Some of the most destructive mutations are ones in which a single base pair is either added to or deleted from the DNA. Consider the effect of an addition of one or two nucleotides or a deletion of one or two nucleotides in a given sequence (see Figure 11.41d). The ribosome moves along the mRNA in the incorrect reading frame from the point of mutation through the remainder of the coding sequence. Mutations of this type are called frameshift mutations. Frameshift mutations lead to the assembly of an entirely abnormal sequence of amino acids from the point of the mutation. It can be noted that, after more than two decades in which it was assumed that the ribosome always moved from one triplet to the next, several examples were discovered in which the mRNA contains a recoding signal that causes the ribosome to change its reading frame, either backing up one nucleotide (a shift to the −1 frame) or skipping a nucleotide (a shift to the +1 frame).

A wide variety of antibiotics have their effect by interfering with various aspects of protein synthesis in bacterial cells. Streptomycin, for example, selectively binds to the small ribosomal subunit of bacterial cells, causing certain of the codons of the mRNA to be misread, thus increasing the synthesis of aberrant proteins. Because the antibiotic does not bind to eukaryotic ribosomes, it has no effect on translation of the host cell’s mRNAs. Resistance by bacteria to streptomycin can be traced to changes in ribosomal proteins, particularly S12.
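The effect of a frameshift mutation described above can be made concrete with a small sketch. It assumes that translation starts at the first nucleotide of the sequence and proceeds in triplets; the example sequences are hypothetical and were chosen only to show how a single added or deleted nucleotide changes every downstream codon, whereas an insertion of three nucleotides preserves the frame.

```python
# Standard genetic code, built from the usual U, C, A, G ordering of codon positions.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_in_frame(mrna):
    """Translate triplets from position 0 until a stop codon or the end."""
    peptide = []
    for pos in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


normal = "AUGUUUAGCGGUUAA"                  # AUG UUU AGC GGU UAA
plus_one = normal[:3] + "C" + normal[3:]    # one nucleotide added after the start codon
minus_one = normal[:3] + normal[4:]         # one nucleotide deleted after the start codon
plus_three = normal[:3] + "CCC" + normal[3:]  # three added: frame is preserved

print(translate_in_frame(normal))      # MFSG
print(translate_in_frame(plus_one))    # ML  (AUG CUU UAG: altered residue, early stop)
print(translate_in_frame(minus_one))   # MLAV (every codon after the deletion differs)
print(translate_in_frame(plus_three))  # MPFSG (one extra residue, rest unchanged)
```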


Termination
As shown in Figure 11.40, three of the 64 trinucleotide codons function as stop codons that terminate polypeptide assembly rather than encode an amino acid. No tRNAs exist whose anticodons are complementary to a stop codon (see footnote 11). When a ribosome reaches one of these codons, UAA, UAG, or UGA, the signal is read to stop further elongation and release the polypeptide associated with the last tRNA. Termination requires release factors. Release factors can be divided into two groups: class I RFs, which recognize stop codons in the A site of the ribosome, and class II RFs, which are GTP-binding proteins (G proteins) whose roles are not well understood. Bacteria have two class I RFs: RF1, which recognizes UAA and UAG stop codons, and RF2, which recognizes UAA and UGA stop codons. Eukaryotes have a single class I RF, eRF1, which recognizes all three stop codons. Class I RFs enter the A site of the ribosome as shown in Figure 11.51. Once in the A site, a conserved tripeptide at one end of the release factor is thought to interact with the stop codon in the A site and to trigger a crucial conformational change affecting several nucleotides of the mRNA of the small ribosomal subunit. The ester bond linking the nascent polypeptide chain to the tRNA is then hydrolyzed, and the completed polypeptide is released. At this point, hydrolysis of the GTP bound to the class II RF (RF3 or eRF3) leads to the release of the class I RF from the ribosome’s A site.

The final steps in translation include the release of the deacylated tRNA from the P site, dissociation of the mRNA from the ribosome, and disassembly of the ribosome into its large and small subunits in preparation for another round of translation. These final steps require a number of protein factors and are quite different in eukaryotes and bacteria. In bacterial cells, these proteins include EF-G, IF3, and RRF (ribosome recycling factor), which promotes ribosomal subunit separation.

[Figure 11.51 labels: polypeptide, nascent peptide, RF, 50S, 30S, mRNA; E, P, and A sites.]

Figure 11.51 Structural model of the first step of translational termination in bacteria. When the ribosome reaches a UAA or UAG stop codon, a class I RF enters the A site and becomes aligned in a manner similar to that of an incoming aa-tRNA. A domain of the RF that resides in the peptidyl transferase center of the ribosome promotes the hydrolysis of the ester bond that links the polypeptide to the peptidyl-tRNA in the P site, thereby releasing the completed polypeptide. (FROM H. S. ZAHER AND R. GREEN, CELL 136:747, 2009, WITH PERMISSION FROM ELSEVIER. COURTESY RACHEL GREEN, JOHNS HOPKINS SCHOOL OF MEDICINE.)

Footnote 11: There are minor exceptions to this statement. It is stated in this chapter that codons dictate the incorporation of 20 different amino acids. In actual fact, there is a 21st amino acid, called selenocysteine, that is incorporated into a very small number of polypeptides. Selenocysteine is a rare amino acid that contains the element selenium. In mammals, for example, it occurs in a dozen or so proteins. Selenocysteine has its own tRNA, called tRNASec, but lacks its own aa-tRNA synthetase. This unique tRNA is recognized by the seryl-tRNA synthetase, which attaches a serine to the 3′ end of the tRNASec. Following attachment, the serine is altered enzymatically to a selenocysteine. Selenocysteine is encoded by UGA, which is one of the three stop codons. Under most circumstances UGA is read as a termination signal. In a few cases, however, the UGA is followed by a folded region of the mRNA that binds a special elongation factor that causes the ribosome to recruit a tRNASec into the A site rather than a termination factor. A 22nd amino acid, pyrrolysine, is encoded by another stop codon (UAG) in the genetic code of some archaea. Pyrrolysine has its own tRNA and aa-tRNA synthetase.
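The codon specificities of the class I release factors given in this section amount to a small lookup table. The sketch below simply encodes the assignments stated in the text; the dictionary layout and the function name are hypothetical, and recoding events such as selenocysteine insertion at UGA are not modeled.

```python
# Class I release factors and the stop codons each recognizes, as stated in the text.
CLASS_I_RF = {
    "bacteria": {"RF1": {"UAA", "UAG"}, "RF2": {"UAA", "UGA"}},
    "eukaryotes": {"eRF1": {"UAA", "UAG", "UGA"}},
}


def release_factors_for(codon, domain):
    """Return the class I RFs that recognize the codon; empty for a sense codon."""
    return sorted(
        name for name, codons in CLASS_I_RF[domain].items() if codon in codons
    )


print(release_factors_for("UAA", "bacteria"))    # ['RF1', 'RF2']
print(release_factors_for("UGA", "bacteria"))    # ['RF2']
print(release_factors_for("UGA", "eukaryotes"))  # ['eRF1']
print(release_factors_for("AUG", "bacteria"))    # []
```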

mRNA Surveillance and Quality Control
Because the three termination codons can be readily formed by single base changes from many other codons (see Figure 11.40), one might expect mutations to arise that produce stop codons within the coding sequence of a gene (Figure 11.41c). Mutations of this type, termed nonsense mutations, have been studied for decades and are responsible for roughly 30 percent of inherited disorders in humans. Premature termination codons (PTCs), as they are also called, are also commonly introduced into mRNAs during splicing. Cells possess an mRNA surveillance mechanism capable of detecting messages with premature termination codons. In most cases, mRNAs containing such mutations are translated only once before they are selectively destroyed by a process called nonsense-mediated decay (NMD). NMD protects the cell from producing nonfunctional, shortened proteins.

How is it possible for a cell to distinguish between a legitimate termination codon that is supposed to end translation of a message and a premature termination codon? To understand this feat, we have to reconsider events that occur during pre-mRNA processing in mammalian cells. It wasn’t mentioned earlier, but when an intron is removed by a spliceosome, a complex of proteins is deposited on the transcript 20–24 nucleotides upstream from the newly formed exon–exon junction. This landmark of the splicing process is called the exon–junction complex (EJC), which stays with the mRNA until it is translated. In a normal mRNA, the termination codon is typically present in the last exon and the EJC is just upstream from that site. As an mRNA undergoes its initial round of translation, the EJCs are thought to be displaced by the advancing ribosome. Consider what would happen during translation of an mRNA that contained a premature termination codon. The ribosome would stop at the site of the mutation and then dissociate, leaving any EJCs that were attached to the mRNA downstream of the site of premature termination. This sets in motion a series of events leading to the enzymatic destruction of the abnormal message.

NMD is best known for its role in eliminating mRNAs transcribed from mutant genes, such as those responsible for cystic fibrosis or muscular dystrophy. For heterozygotes that carry one normal and one disease-causing allele, NMD can eliminate the protein encoded by the mutant allele, thus preventing a potentially toxic effect. For homozygotes carrying two disease-causing alleles, NMD can actually prove very harmful, because it prevents the production of shortened, mutant proteins that often possess some activity. A number of biotechnology companies are developing drugs that either interfere with NMD or allow ribosomes to read through a PTC rather than terminating translation. Both types of drugs should allow RNAs with nonsense codons to be translated into proteins. Even though the protein encoded by the mutant gene in the presence of such drugs will be abnormally short or have an abnormal amino acid sequence, it may still possess enough residual function to rescue the patients from an otherwise fatal disease.

NMD serves as another reminder of the opportunistic nature of biological evolution. Just as evolution has “taken advantage” of the presence of introns to facilitate exon shuffling (page 454), it has also used the process by which these gene inserts are removed to establish a mechanism of quality control, which ensures that only untainted mRNAs advance to a stage where they can be translated.
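The logic by which an exon–junction complex marks a termination codon as premature, as described in this section, can be sketched in a few lines. This is a simplified model under stated assumptions: positions are 0-based coordinates in the spliced mRNA, each EJC is placed a fixed `offset` upstream of its junction (22 is a hypothetical midpoint of the 20–24 nucleotide range given in the text), and the only criterion is whether any EJC remains downstream of the stop codon. The function names and exon lengths are illustrative.

```python
from itertools import accumulate


def ejc_positions(exon_lengths, offset=22):
    """Return one EJC position per exon-exon junction, `offset` nt upstream of it."""
    # Cumulative exon lengths give junction coordinates; the last value is the
    # end of the mRNA, not a junction, so it is dropped.
    junctions = list(accumulate(exon_lengths))[:-1]
    return [junction - offset for junction in junctions]


def triggers_nmd(exon_lengths, stop_position, offset=22):
    """True if at least one EJC lies downstream of the stop codon."""
    return any(pos > stop_position for pos in ejc_positions(exon_lengths, offset))


exons = [300, 200, 400]                # junctions at 300 and 500
print(ejc_positions(exons))            # [278, 478]
print(triggers_nmd(exons, 700))        # False: stop codon in the last exon
print(triggers_nmd(exons, 350))        # True: the EJC at 478 is never displaced
```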

Polyribosomes
When a messenger RNA in the process of being translated is examined in the electron microscope, a number of ribosomes are invariably seen to be attached along the length of the mRNA thread. This complex of ribosomes and mRNA is called a polyribosome, or polysome (Figure 11.52a). Each of the ribosomes initially assembles from its subunits at the initiation codon and then moves from that point toward the 3′ end of the mRNA until it reaches a termination codon. As each ribosome moves away from the initiation codon, another ribosome attaches to the mRNA and begins its translation activity. The rate at which translation initiation occurs varies with the mRNA being studied; some mRNAs have a much greater density of associated ribosomes than others. The simultaneous translation of the same mRNA by numerous ribosomes greatly increases the rate of protein synthesis within the cell.

Recent studies utilizing cryoelectron tomography (Section 18.2) have suggested that the three-dimensional arrangement and orientation of ribosomes within a “free” (i.e., non-membrane-bound) polysome are quite highly ordered. A model of a polysome generated by this technique is shown in Figure 11.52b. The ribosomes that make up this model polysome are densely packed and have adopted a “double-row” array. Moreover, each of the individual ribosomes is oriented so that its nascent polypeptide (red or green filaments) is situated at its outer surface facing the cytosol. It is suggested that this orientation maximizes the distance between nascent chains, thereby minimizing the likelihood that the nascent chains will interact with one another and, possibly, aggregate. Figure 11.52c shows an electron micrograph of polysomes bound to the cytosolic surface of the ER membrane. It is presumed that these polysomes had been engaged in the synthesis of membrane and/or organelle proteins at the time the cell was fixed (page 282). The ribosomes within each polysome appear to be organized at the surface of the ER membrane into a circular loop or spiral.

Now that we have described the basic events of translation, it is fitting to close the chapter with pictures of the process, one taken from a prokaryotic cell (Figure 11.53a) and the other from a eukaryotic cell (Figure 11.53b). Unlike eukaryotic cells, in which transcription occurs in the nucleus and translation occurs in the cytoplasm with many intervening steps, the corresponding activities in bacterial cells are tightly coupled. Thus protein synthesis in bacterial cells begins on mRNA templates well before the mRNA has been completely synthesized. The synthesis of an mRNA proceeds in the same direction as the movement of ribosomes translating that message, that is, from the 5′ to 3′ end. Consequently, as soon as an RNA molecule has begun to be produced, the 5′ end is available for attachment of ribosomes. The micrograph in Figure 11.53a shows DNA being transcribed, nascent mRNAs being synthesized, and ribosomes that are translating each of the nascent mRNAs. Nascent protein chains are not visible in the micrograph of Figure 11.53a, but they are visible in the micrograph of Figure 11.53b, which shows a single polyribosome isolated from a silk gland cell of a silkworm. The silk protein being synthesized is visible because of its large size and fibrous nature. The development of techniques to visualize transcription and translation by Oscar Miller, Jr. provided a fitting visual demonstration of processes that had been previously described in biochemical terms.

Figure 11.52 Polyribosomes. (a) Schematic drawing of a polyribosome (polysome). (b) This three-dimensional model was generated from cryoelectron tomograms of bacterial polysomes in the act of translation in vitro. To obtain the tomograms, preparations were vitrified (frozen into glass-like solid ice without formation of ice crystals) in liquid ethane at −196 °C. Electron micrographs were then taken with the specimen positioned at various tilt angles, providing the data to generate a three-dimensional reconstruction. (c) Electron micrograph of a grazing section through the outer edge of a rough ER cisterna. The ribosomes are aligned in loops and spirals, indicating their attachment to mRNA molecules to form polysomes. (B: FROM FLORIAN BRANDT ET AL., COURTESY OF WOLFGANG BAUMEISTER, CELL 136:267, 2009; BY PERMISSION OF ELSEVIER; C: DON W. FAWCETT/PHOTO RESEARCHERS/GETTY IMAGES, INC.)

Figure 11.53 Visualizing transcription and translation. (a) Electron micrograph of portions of an E. coli chromosome engaged in transcription. The DNA is seen as faint lines running the length of the photo, whereas the nascent mRNA chains are seen to be attached at one of their ends, presumably by an RNA polymerase molecule. The particles associated with the nascent RNAs are ribosomes in the act of translation; in bacteria, transcription and translation occur simultaneously. The RNA molecules increase in length as the distance from the initiation site increases. (b) Electron micrograph of a polyribosome isolated from cells of the silk gland of the silkworm, which produces large quantities of the silk protein fibroin. This protein is large enough to be visible in the micrograph (arrows point to nascent polypeptide chains). (A: FROM OSCAR L. MILLER, JR., BARBARA A. HAMKALO, AND C. A. THOMAS, SCIENCE 169:392, 1970; © 1970, REPRINTED WITH PERMISSION FROM AAAS. B: COURTESY OF STEVEN L. MCKNIGHT AND OSCAR L. MILLER, JR.)
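The statement that simultaneous translation by many ribosomes increases the rate of protein synthesis can be put in rough numbers. The following back-of-the-envelope sketch assumes evenly spaced ribosomes that all move at the same constant rate and ignores the time needed for initiation and termination; the polypeptide length and ribosome count are hypothetical, and the rate of 15 residues per second is taken from the middle of the 10 to 20 per second range quoted earlier for bacterial ribosomes.

```python
def seconds_per_polypeptide(n_residues, residues_per_second):
    """Transit time for one ribosome to synthesize a chain of n_residues."""
    return n_residues / residues_per_second


def polypeptides_per_minute(n_residues, residues_per_second, ribosomes_on_mrna):
    """Steady-state output of one mRNA carrying evenly spaced ribosomes."""
    transit = seconds_per_polypeptide(n_residues, residues_per_second)
    return 60 * ribosomes_on_mrna / transit


print(seconds_per_polypeptide(300, 15))      # 20.0
print(polypeptides_per_minute(300, 15, 1))   # 3.0
print(polypeptides_per_minute(300, 15, 10))  # 30.0
```

Under these assumptions the output of a single mRNA scales in direct proportion to the number of ribosomes it carries.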

REVIEW
1. Describe some of the ways that the initiation step of translation differs from the elongation steps of translation.
2. How does the effect of a nonsense mutation differ from that of a frameshift mutation? Why?
3. What is a polyribosome? How does its formation differ in prokaryotes and eukaryotes?
4. During translation elongation, it can be said that an aminoacyl-tRNA enters the A site, a peptidyl-tRNA enters the P site, and a deacylated tRNA enters the E site. Explain how each of these events occurs.

E X P E R I M E N TA L

P AT H W AY S

The Role of RNA as a Catalyst (NH4)2SO4,mM

5

“Native” Heat - denatured 5" 25 50 120 5 5" 25 50 120 E.coli

23S 16S

IVS RNA

Figure 1 Purified 32P-labeled Tetrahymena ribosomal RNA, transcribed at different (NH4)2SO4 concentrations, was analyzed by polyacrylamide gel electrophoresis. The numbers at the top indicate the concentration of ammonium sulfate. Two groups of samples are shown, “native” and heat-denatured. Samples of the latter group were boiled for five minutes in buffer and cooled on ice to dissociate any molecules that might be held together by hydrogen bonding between complementary bases. The right two columns contain bacterial 16S and 23S rRNAs, which provide markers of known size with which the other bands in the gel can be compared. It can be seen from the positions of the bands that, as the ammonium sulfate concentration rises, the presence of a small RNA appears whose size is equal to that of the isolated intron (IVS, intervening sequence). These data provided the first indication that the rRNA was able to excise the intron without the aid of additional factors. (FROM T. R. CECH ET AL., CELL 27:488, 1981; BY PERMISSION OF ELSEVIER.)

its ability to undergo splicing. The RNA had been extracted with detergent–phenol, centrifuged through a gradient, and treated with a proteolytic enzyme. There were only two reasonable explanations: either splicing was accomplished by a protein that was bound very tightly to the RNA, or the pre-rRNA molecule was capable of splicing itself. The latter idea was not easy to accept. To resolve the question of the presence of a protein contaminant, Cech and co-workers utilized an artificial system that could not possibly contain nuclear splicing proteins.2 The DNA that encodes the rRNA precursor was synthesized in E. coli, purified, and used as a

11.8 Translating Genetic Information

Research in biochemistry and molecular biology through the 1970s had solidified our understanding of the roles of proteins and nucleic acids. Proteins were the agents that made things happen in the cell, the enzymes that accelerated the rates of chemical reactions within organisms. Nucleic acids, on the other hand, were the informational molecules in the cell, storing genetic instructions in their nucleotide sequence. The division of labor between protein and nucleic acids seemed as well defined as any distinction that had been drawn in the biological sciences. Then, in 1981, a paper was published that began to blur this distinction.1 Thomas Cech and his co-workers at the University of Colorado had been studying the process by which the ribosomal RNA precursor synthesized by the ciliated protozoan Tetrahymena thermophila was converted to mature rRNA molecules. The T. thermophila prerRNA contained an intron of about 400 nucleotides that had to be excised from the primary transcript before the adjoining segments could be linked together (ligated). Cech had previously found that nuclei isolated from the cells were able to synthesize the pre-rRNA precursor and carry out the entire splicing reaction. Splicing enzymes had yet to be isolated from any type of cell, and it seemed that Tetrahymena might be a good system in which to study these enzymes. The first step was to isolate the pre-rRNA precursor in an intact state and determine the minimum number of nuclear components that had to be added to the reaction mixture to obtain accurate splicing. It was found that when isolated nuclei were incubated in a medium containing a low concentration of monovalent cations (5 mM NH4⫹), the pre-rRNA molecule was synthesized but the intron was not excised. This allowed the researchers to purify the intact precursor, which they planned to use as a substrate to assay for splicing activity in nuclear extracts. 
They found, however, that when the purified precursor was incubated by itself at higher concentrations of NH₄⁺ in the presence of Mg²⁺ and a guanosine phosphate (e.g., GMP or GTP), the intron was spliced from the precursor (Figure 1).¹ Nucleotide-sequence analysis confirmed that the small RNA that had been excised from the precursor was the intron with an added guanine-containing nucleotide at its 5′ end. The additional nucleotide was shown to be derived from the GTP that was added to the reaction mixture. Splicing of an intron is a complicated reaction that requires recognition of the sequences that border the intron, cleavage of the phosphodiester bonds on both sides of the intron, and ligation of the adjoining fragments. All efforts had been made to remove any protein that might be clinging to the RNA before it was tested for

[Figure 2 gel image: lanes 1–10; band positions labeled pTyr, p4.5, 4.5, Tyr, 5′-Tyr, and 5′-4.5]

Chapter 11 Gene Expression: From Transcription to Translation

Figure 2 The results of polyacrylamide gel electrophoresis of reaction mixtures that had contained the precursor of tyrosinyl-tRNA (pTyr) and the precursor of another RNA called 4.5S RNA (p4.5). We will focus only on pTyr, which is normally processed by ribonuclease P into two RNA molecules, Tyr and 5′-Tyr (which is the 5′ end of the precursor). The positions at which these three RNAs (pTyr, Tyr, and 5′-Tyr) migrate during electrophoresis are indicated to the left of the gel. Lane 1 shows the RNAs that appear in the reaction mixture when pTyr (and p4.5) are incubated with the complete ribonuclease P. Very little of the pTyr remains in the mixture; it has been converted into the two products (Tyr and 5′-Tyr). Lane 5 shows the RNAs that appear in the reaction mixture when pTyr is incubated with the purified protein component of ribonuclease P. The protein shows no ability to cleave the tRNA precursor, as evidenced by the absence of bands where the two products would migrate. In contrast, when pTyr is incubated with the purified RNA component of ribonuclease P (lane 7), the pTyr is processed as efficiently as if the intact ribonucleoprotein had been used. (FROM CECILIA GUERRIER-TAKADA ET AL. CELL 35:850, 1983; BY PERMISSION OF ELSEVIER.)

template for in vitro transcription by a purified bacterial RNA polymerase. The pre-rRNA synthesized in vitro was then purified and incubated by itself in the presence of monovalent and divalent ions and a guanosine compound. Because the RNA had never been in a cell, it could not possibly be contaminated by cellular splicing enzymes. Yet the isolated pre-rRNA underwent the precise splicing reaction that would have occurred in the cell. The RNA had to have spliced itself.

As a result of these experiments, RNA was shown to be capable of catalyzing a complex, multistep reaction. Calculations indicated that the reaction had been accelerated approximately 10 billion-fold over the rate of the noncatalyzed reaction. Thus, like a protein enzyme, the RNA was active at low concentration, was not altered by the reaction, and was able to greatly accelerate a chemical reaction. The primary distinction between this RNA and “standard proteinaceous enzymes” was that the RNA acted on itself rather than on an independent substrate. Cech called the RNA a “ribozyme.”

In 1983, a second, unrelated example of RNA catalysis was discovered.³ Sidney Altman of Yale University and Norman Pace of the National Jewish Hospital in Denver were collaborating on the study of ribonuclease P, an enzyme involved in the processing of a transfer RNA precursor in bacteria. The enzyme was unusual in being composed of both protein and RNA. When incubated in buffers containing high concentrations (60 mM) of Mg²⁺, the purified RNA subunit was found capable of removing the 5′ end of the tRNA precursor (lane 7, Figure 2) just as the entire ribonuclease P molecule would have done inside the cell. The reaction products of the in vitro reaction included the processed, mature tRNA molecule. In contrast, the isolated protein subunit of the enzyme was devoid of catalytic activity (lane 5, Figure 2).

Figure 3 A molecular model of a portion of the catalytic RNA subunit of bacterial ribonuclease P (white) and its bound substrate, precursor tRNA (red). The site on the precursor tRNA where cleavage by the ribozyme occurs is indicated by a yellow sphere. It is interesting to note that RNase P in human mitochondria is a protein enzyme and does not require an RNA component for catalytic activity. This mitochondrial RNase P enzyme is presumably an example of the evolutionary takeover of a catalytic activity by a protein from an ancient RNA. (COURTESY OF MICHAEL E. HARRIS AND NORMAN R. PACE.)

To eliminate the possibility that a protein contaminant was actually catalyzing the reaction, the RNA portion of ribonuclease P was synthesized in vitro from a recombinant DNA template. As with the RNA that had been extracted from bacterial cells, this artificially synthesized RNA, without any added protein, was capable of accurately cleaving the tRNA precursor.⁴ Unlike the rRNA processing enzyme studied by Cech, the RNA of ribonuclease P acted on another molecule as a substrate rather than on itself. Thus, it was demonstrated that ribozymes have the same catalytic properties as proteinaceous enzymes. A model of the interaction between the catalytic RNA subunit of ribonuclease P and a precursor tRNA substrate is shown in Figure 3.

The demonstration that RNA could catalyze chemical reactions created the proper atmosphere to reinvestigate an old question: Which component of the large ribosomal subunit is the peptidyl transferase, that is, the catalyst for peptide bond formation? During the 1970s, a number of independent findings raised the possibility that ribosomal RNAs might be doing more than simply acting as a scaffold to hold ribosomal proteins in the proper position so they could catalyze translation. Included among the evidence were the following types of data:

1. Certain strains of E. coli carry genes that encode bacteria-killing proteins called colicins. One of these toxins, colicin E3, was known to inhibit protein synthesis in sensitive bacterial cells. Ribosomes that had been isolated from colicin E3–treated cells appeared perfectly normal by most criteria, but they were unable to support protein synthesis in vitro. Further analysis of such ribosomes revealed that the defect resides in the rRNA, not the ribosomal proteins. The 16S RNA of the small subunit is cleaved by the colicin about 50 nucleotides from its 3′ end, which renders the entire subunit unable to support protein synthesis.⁵
2. Treatment of large ribosomal subunits with ribonuclease T1, an enzyme that cleaves bonds between accessible RNA nucleotides, destroyed the subunit’s ability to carry out the peptidyl transferase reaction.⁶
3. A wide variety of studies on antibiotics that inhibited peptide bond formation, including chloramphenicol, carbomycin, and erythromycin, suggested that these drugs act on the ribosomal RNA, not the protein. For example, it was found that ribosomes became resistant to the effects of chloramphenicol as the result of substitutions in the bases in the ribosomal RNA.⁷
4. Ribosomal RNAs were shown to have a highly conserved base sequence, much more so than the amino acid sequences of ribosomal proteins. Some of the regions of the ribosomal RNAs are virtually unchanged in ribosomes isolated from prokaryotes, plants, and animals, as well as ribosomes isolated from mitochondria and chloroplasts. The fact that rRNA sequences are so highly conserved suggests that the molecules have a crucial role in the function of the ribosome.⁸,⁹ In fact, a 1975 paper by C. R. Woese and co-workers states, “Since little or no correlation exists between these conserved regions and known ribosomal protein binding sites, the implication is strong that large areas of the RNA are directly involved in ribosomal function.”⁸

References

1. CECH, T. R., ZAUG, A. J., & GRABOWSKI, P. J. 1981. In vitro splicing of the ribosomal RNA precursor of Tetrahymena. Cell 27:487–496.
2. KRUGER, K. ET AL. 1982. Self-splicing RNA: Autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31:147–157.
3. GUERRIER-TAKADA, C. ET AL. 1983. The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 35:849–857.
4. GUERRIER-TAKADA, C. ET AL. 1984. Catalytic activity of an RNA molecule prepared by transcription in vitro. Science 223:285–286.
5. BOWMAN, C. M. ET AL. 1971. Specific inactivation of 16S ribosomal RNA produced by colicin E3 in vivo. Proc. Nat’l. Acad. Sci. USA 68:964–968.
6. CERNA, J., RYCHLIK, I., & JONAK, J. 1975. Peptidyl transferase activity of Escherichia coli ribosomes digested by ribonuclease T1. Eur. J. Biochem. 34:551–556.
7. KEARSEY, S. & CRAIG, I. W. 1981. Altered ribosomal RNA genes in mitochondria from mammalian cells with chloramphenicol resistance. Nature 290:607–608.
8. WOESE, C. R. ET AL. 1975. Conservation of primary structure in 16S ribosomal RNA. Nature 254:83–86.
9. NOLLER, H. F. & WOESE, C. R. 1981. Secondary structure of 16S ribosomal RNA. Science 212:403–411.
10. MOAZED, D. & NOLLER, H. F. 1989. Interaction of tRNA with 23S rRNA in the ribosomal A, P, and E sites. Cell 57:585–597.
11. NOLLER, H. F., HOFFARTH, V., & ZIMNIAK, L. 1992. Unusual resistance of peptidyl transferase to protein extraction procedures. Science 256:1416–1419.
12. BAN, N. ET AL. 2000. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289:905–920.
13. WIMBERLY, B. T. ET AL. 2000. Structure of the 30S ribosomal subunit. Nature 407:327–339.
14. SCHLUENZEN, F. ET AL. 2000. Structure of functionally activated small ribosomal subunit at 3.3 Å resolution. Cell 102:615–623.
15. NISSEN, P. ET AL. 2000. The structural basis of ribosome activity in peptide bond synthesis. Science 289:920–930.
16. RAMAKRISHNAN, V. 2011. The eukaryotic ribosome. Science 331:681–682.
17. BEN-SHEM, A. ET AL. 2011. The structure of the eukaryotic ribosome at 3.0 Å resolution. Science 334:1524–1529.
18. HILLER, D. A. ET AL. 2011. A two-step chemical mechanism for ribosome catalysed peptide bond formation. Nature 476:236–239.


Following the discovery of catalytic RNAs by Cech’s and Altman’s laboratories, investigation into the roles of ribosomal RNAs intensified. A number of studies by Harry Noller and his colleagues at the University of California, Santa Cruz, pinpointed the site in the ribosomal RNA that resides in or around the peptidyl transferase center.⁹ In one study, it was shown that transfer RNAs bound to the ribosome protect specific bases in the rRNA of the large subunit from attack by specific chemical agents. Protection from chemical attack is evidence that the tRNA must be situated very close to the rRNA bases that are shielded.¹⁰ Protection is lost if the 3′ end of the tRNA (the end with the CCA bonded to the amino acid) is removed. This is the end of the tRNA involved in peptide bond formation, which would be expected to reside in close proximity to the peptidyl transferase site.

Attempts to ascribe a particular function to isolated ribosomal RNA had always met with failure. It was presumed that, even if ribosomal RNAs did have specific functions, the presence of ribosomal proteins was most likely required to keep the rRNAs in their proper conformation. Considering that the ribosomal proteins and rRNAs have coevolved for billions of years, the two molecules would be expected to be interdependent. Despite this expectation, the catalytic power of isolated rRNA was finally demonstrated by Noller and co-workers in 1992.¹¹ Working with particularly stable ribosomes from Thermus aquaticus, a bacterium that lives at high temperatures, Noller treated preparations of the large ribosomal subunit with a protein-extracting detergent (SDS), a protein-degrading enzyme (proteinase K), and several rounds of phenol, a protein denaturant. Together, these agents removed at least 95 percent of the protein from the ribosomal subunit, leaving the rRNA behind.
Most of the 5 percent of the protein that remained associated with the rRNA was demonstrated to consist of small fragments of the ribosomal proteins. Yet, despite the removal of nearly all of the protein, the rRNA retained 80 percent of the peptidyl transferase activity of the intact subunit. The catalytic activity was blocked by chloramphenicol and by treatment with ribonuclease. When the RNA was subjected to additional treatments to remove the small amount of remaining protein, the preparation lost its catalytic activity. Because it is very unlikely that the remaining protein had any catalytic importance, it is generally agreed that these experiments demonstrated that peptidyl transferase is a ribozyme. Final confirmation of this conclusion awaited results obtained by X-ray crystallographic studies. As might be expected from the complexity of ribosomal subunits, obtaining high-quality crystals of such particles was a very difficult endeavor. The greatest successes in this field were obtained by Ada Yonath and colleagues at the Weizmann Institute in Israel, who had turned to using ribosomes prepared from prokaryotic extremophiles (page 14) that lived in waters of high temperature or high salinity. Efforts in the X-ray crystallographic analysis of ribosomal subunits finally culminated with the publication in 2000 of high-resolution crystal structures of (1) the large ribosomal subunit

of Haloarcula marismortui, an archaebacterium that lives in the Dead Sea, by Thomas Steitz, Peter Moore, and their colleagues at Yale University,¹² and (2) the 30S subunit of the thermophilic bacterium, Thermus thermophilus, by Venkatraman Ramakrishnan and colleagues at Cambridge University¹³ and by Ada Yonath and colleagues in Germany and Israel.¹⁴ A model of the 50S subunit is shown in the chapter-opening image on page 426.

To identify the peptidyl transferase site within the large ribosomal subunit, Steitz and colleagues soaked crystals of these subunits with CCdA–phosphate–puromycin, a substance that inhibits peptide bond formation by binding tightly to the peptidyl transferase active site. Determination of the structure of these subunits at atomic resolution reveals the location of the bound inhibitor, and thus the location of the peptidyl transferase site.¹⁵ It was found in this study that the active-site inhibitor is bound within a cleft of the subunit that is surrounded entirely by conserved nucleotide residues of the 23S rRNA. In fact, there isn’t a single amino acid side chain of any of the ribosomal proteins within about 18 Å of the site where a peptide bond is synthesized. It would seem hard to argue with the conclusion that the ribosome is a ribozyme. (X-ray crystallographic studies of eukaryotic ribosomes are discussed in references 16 and 17 and the mechanism of action of the 23S rRNA in peptide bond formation in reference 18.)


Synopsis

Our understanding of the relationship between genes and proteins came as the result of a number of key observations. The first important observation was made by Garrod, who concluded that persons suffering from inherited metabolic diseases were missing specific enzymes. Later, Beadle and Tatum induced mutations in the genes of Neurospora and identified the specific metabolic reactions that were affected. These studies led to the concept of “one gene–one enzyme” and subsequently to the more refined version of “one gene–one polypeptide chain.” (p. 427)

The first step in gene expression is the transcription of a DNA template strand by an RNA polymerase. Polymerase molecules are directed to the proper site on the DNA by binding to a promoter region, which in almost every case lies just upstream from the site at which transcription is initiated. The polymerase moves in the 3′ to 5′ direction along the template DNA strand, assembling a complementary, antiparallel strand of RNA that grows from its 5′ terminus in a 3′ direction. At each step along the way, the enzyme catalyzes a reaction in which ribonucleoside triphosphates (NTPs) are hydrolyzed into nucleoside monophosphates as they are incorporated. The reaction is further driven by hydrolysis of the released pyrophosphate. Prokaryotes possess a single type of RNA polymerase that can associate with a variety of different sigma factors that determine which genes are transcribed. The site at which transcription is initiated is determined by a nucleotide sequence located approximately 10 bases upstream from the initiation site. (p. 429)

Eukaryotic cells have three distinct RNA polymerases (I, II, and III), each responsible for the synthesis of a different group of RNAs. Approximately 80 percent of a cell’s RNA consists of ribosomal RNA (rRNA). Ribosomal RNAs (with the exception of the 5S species) are synthesized by RNA polymerase I, transfer RNAs and 5S rRNA by RNA polymerase III, and mRNAs by RNA polymerase II.
All three types of RNA are derived from primary transcripts that are longer than the final RNA product. RNA processing requires a variety of small nuclear RNAs (snRNAs). (p. 433)

Three of the four eukaryotic rRNAs (the 5.8S, 18S, and 28S rRNAs) are synthesized from a single transcription unit, consisting of DNA (rDNA) that is localized within the nucleolus, and are processed by a series of nucleolar reactions. Nucleoli from amphibian oocytes can be dispersed to reveal actively transcribed rDNA, which takes the form of a chain of “Christmas trees.” Each of the trees is a transcription unit whose smaller branches represent shorter RNAs that are at an earlier stage of transcription, that is, closer to the site where RNA synthesis was initiated. Analysis of these complexes reveals the tandem arrangement of the rRNA genes, the presence of nontranscribed spacers separating the transcription units, and the presence of associated ribonucleoprotein (RNP) particles that are involved in processing the transcripts. The steps in the processing of rRNA have been studied by exposing cultured mammalian cells to labeled precursors, such as [¹⁴C]methionine, whose methyl groups are transferred to a number of nucleotides of the pre-rRNA. The presence of the methyl groups is thought to protect certain sites on the RNA from cleavage by nucleases and to aid in the folding of the RNA molecule. Approximately half of the 45S primary transcript is removed during the course of formation of the three mature rRNA products. The nucleolus is also the site of assembly of the two ribosomal subunits. (p. 435)

Kinetic studies of rapidly labeled RNAs first suggested that mRNAs were derived from much larger precursors. When eukaryotic cells are incubated for one to a few minutes in [³H]uridine or other labeled RNA precursors, most of the label is incorporated into a group of RNA molecules of very large molecular weight and diverse nucleotide sequence that are restricted to the nucleus. These RNAs are referred to as heterogeneous nuclear RNAs (or hnRNAs). When cells that have been incubated briefly with [³H]uridine are chased in a medium containing unlabeled precursors for an hour or more, radioactivity appears in smaller cytoplasmic mRNAs. This and other evidence, such as the presence of 5′ caps and 3′ poly(A) tails on both hnRNAs and mRNAs, led to the conclusion that hnRNAs were the precursors of mRNAs. (p. 441)

Pre-mRNAs are synthesized by RNA polymerase II in conjunction with a number of general transcription factors that allow the polymerase to recognize the proper DNA sites and to initiate transcription at the proper nucleotide. In many genes, the promoter lies upstream from the site of initiation in a region containing the TATA box. The TATA box is recognized by the TATA-binding protein (TBP), whose binding to the DNA initiates the assembly of a preinitiation complex. Phosphorylation of a portion of the RNA polymerase leads to the disengagement of the polymerase and the initiation of transcription. (p. 442)

One of the most important revisions in our concept of the gene came in the late 1970s with the discovery that the coding regions of a gene did not form a continuous sequence of nucleotides. The first observations in this regard were made on studies of transcription of the adenovirus genome in which it was found that the terminal portion of a number of different messenger RNAs are composed of the same sequence of nucleotides that is encoded by several discontinuous segments in the DNA. The regions between the coding segments are called intervening sequences, or introns. A similar condition was soon found to exist for cellular genes, such as those that code for β-globin and ovalbumin.
In each case, the regions of DNA that encode portions of the polypeptide (the exons) are separated from one another by noncoding regions (the introns). Subsequent analysis indicated that the entire gene is transcribed into a primary transcript. The regions corresponding to the introns are subsequently excised from the pre-mRNA, and adjacent coding segments are ligated together. This process of excision and ligation is called RNA splicing. (p. 444)

The major steps in the processing of primary transcripts into mRNA include the addition of a 5′ cap, formation of a 3′ end, addition of a 3′ poly(A) tail, and removal of introns. The formation of the 5′ cap occurs by stepwise reactions in which the terminal phosphate is removed, a GMP is added in an inverted orientation, and methyl groups are transferred to both the added guanosine and the first nucleotide of the transcript itself. The final 3′ end of the mRNA is generated by cleavage of the primary transcript and the addition of adenosine residues one at a time by poly(A) polymerase. Removal of the introns from the primary transcript depends on the presence of invariant residues at both the 5′ and 3′ splice sites on either side of each intron. Splicing is accomplished by a multicomponent spliceosome that contains a variety of proteins and ribonucleoprotein particles (snRNPs) that assemble in a stepwise fashion at the site of intron removal. Studies suggest that the snRNAs of the spliceosomes, possibly in conjunction with proteins, are the catalytically active components of the snRNPs. One of the apparent evolutionary benefits of split genes is the ease with which exons can be shuffled within the genome, generating new genes from portions of preexisting ones. (p. 448)

Most eukaryotic cells have a mechanism called RNA interference induced by double-stranded RNAs (siRNAs) that leads to the destruction of complementary mRNAs. RNA interference is thought to have evolved as a defense against viral infection and/or transposon mobility. Researchers have taken advantage of the phenomenon to stop the synthesis of specific proteins by targeting their mRNAs. Eukaryotic genomes encode large numbers of small microRNAs (miRNAs) (20–25 nucleotides) that regulate translation of specific mRNAs. Both siRNAs and miRNAs are generated by a common processing machinery, which includes the enzyme Dicer that cleaves the precursor, and an effector complex RISC, which holds the single-stranded guide RNA that either cuts the mRNA or blocks its translation. A third class of small regulatory RNAs, called piRNAs, are formed from single-stranded RNA precursors and do not require Dicer during biogenesis. piRNAs are active in suppressing transposable element mobility in germ cells. (p. 455)

Information for the incorporation of amino acids into a polypeptide is encoded in the sequence of triplet codons in the mRNA. In addition to being triplet, the genetic code is nonoverlapping and degenerate. In a nonoverlapping code, each nucleotide is part of one, and only one, codon, and the ribosome must move along the message three nucleotides at a time. The ribosome attaches to the mRNA at the initiation codon, AUG, which automatically puts the ribosome in the proper reading frame so that it correctly reads the entire message. A triplet code constructed of 4 different nucleotides can have 64 (4³) different codons. The code is degenerate because many of the 20 different amino acids have more than 1 codon. Of the 64 possible codons, 61 specify an amino acid, while the other 3 are stop codons that cause the ribosome to terminate translation. The codon assignments are essentially universal, and their sequences are such that base substitutions in the mRNA tend to minimize the effect on the polypeptide. (p. 461)

Information in the nucleotide alphabet of DNA and RNA is decoded by transfer RNAs during the process of translation.
Transfer RNAs are small RNAs (73 to 93 nucleotides long) that share a similar, L-shaped, three-dimensional structure and a number of invariant residues. One end of the tRNA bears the amino acid, and the other end contains a three-nucleotide anticodon sequence that is complementary to the triplet codon of the mRNA. The steric requirements of complementarity between codon and anticodon are relaxed at the third position of the codon to allow different codons that specify the same amino acid to use the same tRNA. It is essential that each tRNA is linked to the proper (cognate) amino acid, that is, the amino acid specified by the mRNA codon to which the tRNA anticodon binds. Linkage of tRNAs to their cognate amino acids is accomplished by a group of enzymes called aminoacyl-tRNA synthetases. Each enzyme is specific for one of the 20 amino acids and is able to recognize all of the tRNAs to which that amino acid must be linked. (p. 464)

Protein synthesis is a complex synthetic activity involving all the various tRNAs with their attached amino acids, ribosomes, messenger RNA, a number of proteins, cations, and GTP. The process is divided into three distinct activities: initiation, elongation, and termination. The primary activities of initiation include the precise attachment of the small ribosomal subunit to the initiation codon of the mRNA, which sets the reading frame for the entire translational process; the entry into the ribosome of a special initiator tRNA; and the assembly of the translational machinery. During elongation, a cycle of tRNA entry, peptide bond formation, and tRNA exit repeats itself with every amino acid incorporated. Aminoacyl-tRNAs enter the A site, where they bind to the complementary codon of the mRNA. After tRNA enters, the nascent polypeptide attached to the tRNA of the P site is transferred to the amino acid on the tRNA of the A site, forming a peptide bond. Peptide bond formation is catalyzed by a portion of the large rRNA acting as a ribozyme. In the last step of elongation, the ribosome translocates to the next codon of the mRNA, as the deacylated tRNA of the P site is transferred to the E site, and the deacylated tRNA that was in the E site is released from the ribosome. Both initiation and elongation require GTP hydrolysis. Translation is terminated when the ribosome reaches one of the three stop codons.
After a ribosome assembles at the initiation codon and moves a short distance toward the 3′ end of the mRNA, another ribosome generally attaches to the initiation codon so that each mRNA is translated simultaneously by a number of ribosomes, which greatly increases the rate of protein synthesis within the cell. The complex of an mRNA and its associated ribosomes is a polyribosome. (p. 468)
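
Several points in the synopsis (the direction of transcription, the reading frame set by AUG, and the size and degeneracy of the triplet code) can be illustrated with a short Python sketch. It is an illustration only, not part of the original text. The function names (transcribe, translate), the one-letter amino acid abbreviations, and the example template sequence are assumptions made for the demonstration, and the codon assignments are those of the standard genetic code, the chart the chapter cites as Figure 11.40. Translation is reduced to "start at the first AUG, read triplets, stop at a stop codon," which ignores the ribosome, tRNAs, and initiation factors entirely.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# One-letter amino acid abbreviations; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() yields UUU, UUC, UUA, UUG, UCU, ... matching the string above.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
# Number of codons assigned to each amino acid (the code's degeneracy).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template
    strand that is supplied in the 3'->5' direction."""
    pairing = str.maketrans("ATGC", "UACG")
    return template_3_to_5.upper().translate(pairing)


def translate(mrna: str) -> str:
    """Read triplets from the first AUG until a stop codon or the end
    of the message; return the peptide as one-letter codes."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, so no reading frame is set
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical template strand, written 3'->5'.
mrna = transcribe("TACAAATCGATT")
print(len(CODON_TABLE), "codons,", len(STOP_CODONS), "of them stop codons")
print(len(DEGENERACY), "amino acids; codons per amino acid range from",
      min(DEGENERACY.values()), "to", max(DEGENERACY.values()))
print(mrna, "->", translate(mrna))
```

With the example template, the transcript is AUGUUUAGCUAA and the peptide is Met-Phe-Ser (MFS). The table confirms the counts given in the synopsis: 64 codons, of which 61 specify one of 20 amino acids and 3 are stop codons.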

Analytic Questions

1. Look at the codon chart of Figure 11.40; which codons would you expect to have a unique tRNA, that is, one that is used only for that codon? Why is it that many codons do not have their own unique tRNA?
2. Proflavin is a compound that inserts itself into DNA and causes frameshift mutations (page 473). How would the effect on the amino acid sequence of a proflavin-induced mutation differ between an overlapping and a nonoverlapping code?
3. You have just isolated a new drug that has only one effect on cell metabolism; it totally inhibits the breakdown of pre-rRNA to ribosomal RNA. After treating a culture of mammalian cells with this drug, you give the cells [³H]uridine for 2 minutes, and then grow the cells in the presence of the drug in unlabeled medium for 4 hours before extracting the RNA and centrifuging it through a sucrose gradient. Draw the curves you would obtain by plotting both absorbance at 260 nm and radioactivity against the fraction number of the gradient. Label the abscissa (X-axis) using S values of RNA.
4. Using the same axes as in the previous question, draw the profile of radioactive RNA that you would obtain after a culture of mammalian cells had been incubated for 48 hours in [³H]uridine without any inhibiting drugs present.
5. Suppose you were to construct a synthetic RNA from a repeating dinucleotide (e.g., AGAGAGAG) and then use this RNA as a messenger to synthesize a polypeptide in an in vitro protein-synthesizing system, such as that used by Nirenberg and Matthaei to produce polyphenylalanine. What type of polypeptide would you make from this particular polynucleotide? Would you expect to have more than one type of polypeptide produced? Why or why not?
6. Suppose that you found an enzyme that incorporated nucleotides randomly into a polymer without the requirement for a template. How many different codons would you be able to find in synthetic RNAs made using two different nucleotide precursors (e.g., CTP and ATP)? (An enzyme called polynucleotide phosphorylase catalyzes this type of reaction and was used in studies that identified codons.)
7. Draw the parts of a 15S globin pre-mRNA, labeling the noncoding portions.
8. What is the minimum number of GTPs that you would need to synthesize a pentapeptide in a bacterium?
9. On the same set of axes as in Figure 10.17, draw two reannealing curves, one for DNA extracted from Xenopus brain tissue and the other for DNA extracted from Xenopus oocytes.
10. Would you agree with the following statement? The discovery that sickle cell anemia resulted from a single amino acid change was proof that the genetic code was nonoverlapping. Why or why not?
11. Thalassemia is a disease characterized by mutations that convert amino acid codons into stop codons. Suppose you were to compare the polypeptides synthesized in vitro from mRNAs purified from a wide variety of thalassemia patients. How would you expect these polypeptides to compare? Refer to the codon chart of Figure 11.40; how many amino acid codons can be converted into stop codons by a single base substitution?
12. Do you think it would be theoretically possible to have a genetic code with only two letters, A and T? If so, what would be the minimum number of nucleotides required to make a codon?
13. On page 454, experiments are reported that lead to the synthesis of ribozymes with unique catalytic properties. In 2001, an artificial ribozyme was isolated that was capable of incorporating up to 14 ribonucleotides onto the end of an existing RNA using an RNA strand as a template. The ribozyme could use any RNA sequence as a template and would incorporate complementary nucleotides into the newly synthesized RNA strand with an accuracy of 98.5 percent. If you were a proponent of an ancient RNA world, how would you use this finding to argue your case? Would this prove the existence of an ancient RNA world? If not, what do you think would provide the strongest evidence for such a world?
14. How is it possible that mRNA synthesis occurs at a greater rate in bacterial cells than any other class, yet very little mRNA is present within the cell?
15. If a codon for serine is 5′-AGC-3′, the anticodon for this triplet would be 5′-_ _ _-3′. How would the wobble phenomenon affect this codon–anticodon interaction?
16. One of the main arguments that proteins evolved before DNA (i.e., that the RNA world evolved into an RNA–protein world rather than an RNA–DNA world) is based on the fact that the translational machinery involves a large variety of RNAs (e.g., tRNAs, rRNAs), whereas the transcriptional machinery shows no evidence of RNA involvement. Can you explain how such an argument about the stages of early evolution might be based on these observations?
17. Frameshift mutations and nonsense mutations were described on page 474. It was noted that nonsense mutations often lead to the destruction by NMD of an mRNA containing a premature termination codon. Would you expect mRNAs containing frameshift mutations to be subject to NMD?
18. The arrowheads in Figure 11.16 indicate the direction of transcription of the various tRNA genes. What does this drawing tell you about the template activity of each strand of a DNA molecule within a chromosome?
19. Genes are usually discovered by finding an abnormal phenotype resulting from a genetic mutation. Alternatively, they can often be identified by examining the DNA sequence of a genome. Why do you suppose that the genes encoding miRNAs were not discovered until very recently?
20. It was suggested on page 464 that synonymous codon changes do not generally alter an organism’s phenotype. Can you think of an occasion where this might not be true? What does this say about the coding requirements of the genetic material?


12 Control of Gene Expression

12.1 Control of Gene Expression in Bacteria
12.2 Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus
12.3 An Overview of Gene Regulation in Eukaryotes
12.4 Transcriptional Control
12.5 RNA Processing Control
12.6 Translational Control
12.7 Posttranslational Control: Determining Protein Stability
THE HUMAN PERSPECTIVE: Chromosomal Aberrations and Human Disorders

In spite of their obvious differences in form and function, the cells that make up a multicellular organism each contain a complete set of genes. The genetic information present in all cells can be compared to a book of blueprints prepared for the construction of a large, multipurpose building. At different times, all of the blueprints will be needed, but only a small subset is consulted during the work on a particular floor or room. The same is true for a fertilized egg, which contains a complete set of genetic instructions that is faithfully replicated and distributed to each cell of a developing organism, but only a subset of genes is expressed in any particular cell. Single-celled organisms such as bacteria and protists also contain a complete set of genes, but only a subset is expressed at any one time, depending on environmental cues or food sources. Thus the cells of all organisms carry much more genetic information than they will ever use at any particular moment. Cells possess mechanisms that allow them to precisely regulate their genetic information, expressing genes only when they are needed. In this chapter, we will explore some of the ways in which prokaryotic and eukaryotic cells control gene expression and thereby ensure that certain RNAs and proteins are synthesized, while others are not. Much of what we know about the control of gene expression is based on studies examining a single gene under different circumstances. However, with the advent of new technologies and the sequencing of entire genomes, we are beginning to understand how the entire repertoire of expressed genes is regulated (as illustrated in the chapter-opening figure).

Genes operate as parts of interacting networks. This map displays the functional interactions of approximately 1700 genes from the budding yeast S. cerevisiae. In order to determine which of these genes participate in a related cellular process or act in the same organelle, cells with mutations in two randomly selected genes were screened. To study this number of genes, researchers had to examine approximately 5.4 million different combinations of mutant genes spanning all biological processes. Two genes were determined to be involved in a related activity if the phenotype of the double mutant was more extreme than would be expected from the combined phenotype of the two single mutants (i.e., cells containing only one of the two mutant genes). Overall, the study identified 170,000 gene–gene interactions. The results are summarized in this genetic interaction map, which connects genes with similar genetic interaction profiles. Each gene is depicted as a single dot. Genes sharing similar patterns of genetic interactions are positioned closer to each other, whereas genes with less similar patterns are positioned farther apart. Colored regions indicate sets of genes enriched for different biological processes, which are identified by the indicated labels. (REPRINTED WITH PERMISSION FROM MICHAEL COSTANZO ET AL., SCIENCE 327: 425, 2010, COURTESY OF CHARLES BOONE, UNIVERSITY OF TORONTO; © 2010, REPRINTED WITH PERMISSION FROM AAAS.)
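The scoring rule in the legend of the chapter-opening figure, in which a double mutant is compared with what the two single mutants would predict, can be sketched in a few lines of Python. This is only an illustration: the multiplicative expectation, the gene names, the fitness values, and the cutoff are all assumptions made for the example and are not taken from the study.

```python
from itertools import combinations

# Hypothetical relative fitness values (wild type = 1.0); not data from the study.
single_fitness = {"geneA": 0.9, "geneB": 0.8, "geneC": 0.7}
double_fitness = {
    ("geneA", "geneB"): 0.72,  # about what the single mutants predict
    ("geneA", "geneC"): 0.30,  # far sicker than the single mutants predict
    ("geneB", "geneC"): 0.56,  # about what the single mutants predict
}


def interaction_score(f_a, f_b, f_ab):
    """Observed double-mutant fitness minus the expected value.

    The expectation used here is the product of the two single-mutant
    fitness values (an assumed model for this sketch).
    """
    return f_ab - f_a * f_b


def find_interactions(single, double, cutoff=0.08):
    """Return gene pairs whose double mutant is more extreme than expected."""
    hits = []
    for a, b in combinations(sorted(single), 2):
        score = interaction_score(single[a], single[b], double[(a, b)])
        if score < -cutoff:  # assumed cutoff for calling an interaction
            hits.append((a, b, score))
    return hits


if __name__ == "__main__":
    for a, b, score in find_interactions(single_fitness, double_fitness):
        print(f"{a} and {b} interact (score {score:.2f})")
```

In this toy data set only the geneA and geneC pair is flagged, because its double mutant is much less fit than the two single mutants would suggest.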

12.1 | Control of Gene Expression in Bacteria

Organization of Bacterial Genomes

Prokaryotic genomes are organized in a number of different ways, but circular double-stranded DNA genomes are a common arrangement. For such organisms, there is hardly any excess DNA; nearly all the DNA encodes RNAs or proteins with little spacing between individual transcription units. In addition, genes involved in the same biological process are often grouped together to allow coordinate regulation of the entire group. For example, there may be a set of genes involved in flagella formation directly adjacent to a group of genes involved in sugar metabolism. Because of this arrangement, it is essential that decisions on where and when to start and stop transcription or translation are precisely regulated.

The Bacterial Operon

A bacterial cell lives in direct contact with its environment, which may change dramatically in chemical composition or temperature from one moment to the next. At certain times, a particular food source may be present, while at other times that compound is absent. Consider the consequences of transferring a culture of bacteria from a minimal medium to one containing either (1) lactose or (2) tryptophan.

1. Lactose is a disaccharide (see Figure 2.16) composed of glucose and galactose whose oxidation can provide the cell with metabolic intermediates and energy. The first step in the catabolism (i.e., degradation) of lactose is the hydrolysis of the bond that joins the two sugars (a β-galactoside linkage), a reaction catalyzed by the enzyme β-galactosidase. When bacterial cells are growing under minimal conditions, the cells have no need for β-galactosidase. Under minimal conditions, an average cell contains fewer than five copies of β-galactosidase and a single copy of the corresponding mRNA. Within a few minutes after adding lactose to the culture medium, cells contain approximately 1000 times the number of β-galactosidase molecules. The presence of lactose has induced the synthesis of this enzyme (Figure 12.1).

2. Tryptophan is an amino acid required for protein synthesis. In humans, tryptophan is an essential amino acid; it must come from the diet. In contrast, bacterial cells can synthesize tryptophan in a series of reactions requiring the activity of multiple enzymes. Cells growing in the absence of tryptophan activate the genes encoding these enzymes. If, however, tryptophan should become available in the medium, the cells no longer have to synthesize this amino acid, and, within a few minutes, production of the enzymes needed to synthesize tryptophan is repressed.

[Plot: β-galactosidase (units/ml) or growth (mg bacteria/ml) and amount of mRNA, on a logarithmic scale from 0.1 to 1000, versus minutes (0 to 15); curves for β-galactosidase, mRNA, and growth; arrows mark addition and elimination of inducer.]

Figure 12.1 Kinetics of β-galactosidase induction in E. coli. When a β-galactoside such as lactose is added to a culture of these bacteria, the production of mRNA for the enzyme β-galactosidase begins very rapidly, followed within a minute or so by the appearance of the protein, whose concentration increases rapidly. Removal of the inducer leads to a precipitous drop in the level of the mRNA, which reflects its rapid degradation. The amount of enzyme then levels off because new molecules are no longer synthesized.

In bacteria, the genes that encode the enzymes needed to synthesize tryptophan are clustered together on the chromosome in a functional complex called an operon. All of the genes of an operon are coordinately controlled by a mechanism first described in 1961 by François Jacob and Jacques Monod of the Pasteur Institute in Paris. A typical bacterial operon consists of structural genes, a promoter region, an operator region, and a regulatory gene (Figure 12.2).

■ Structural genes encode the enzymes themselves. The structural genes of an operon usually lie adjacent to one another, and RNA polymerase moves from one structural gene to the next, transcribing all of the genes into a single mRNA. An mRNA containing information for more than one polypeptide is called a polycistronic mRNA. The polycistronic mRNA is then translated into the various individual enzymes of the metabolic pathway.

■ The promoter is the site where the RNA polymerase binds to the DNA prior to beginning transcription (discussed on page 442).

■ The operator typically lies adjacent to, or overlaps with, the promoter (see Figure 12.4) and serves as the binding site for a protein, usually a repressor. The repressor is an example of a gene regulatory protein, that is, a protein that recognizes a specific sequence of base pairs within the DNA and binds to that sequence with high affinity. As will be evident in the remaining sections of this chapter, DNA-binding proteins, such as bacterial repressors, play a prominent role in determining whether or not a particular segment of the genome is transcribed.

■ The regulatory gene encodes the repressor protein.

[Diagram labels: bacterial DNA; regulatory gene; repressor protein; promoter (P); operator (O); structural genes 1 to 3 (code for enzymes of the same metabolic pathway). The operon’s components are shown in orange.]

Figure 12.2 Organization of a bacterial operon. The enzymes that make up a metabolic pathway are encoded by a series of structural genes that reside in a contiguous array within the bacterial chromosome. The structural genes of an operon are transcribed into a single polycistronic mRNA, which is translated into separate polypeptides. Transcription of the structural genes is controlled by a repressor protein that is synthesized by a regulatory gene, which is also part of the operon. When bound to the operator site of the DNA, the repressor protein blocks movement of RNA polymerase from the promoter to the structural genes.

The key to operon expression lies in the sequence of the operator and the presence or absence of a repressor. When the repressor binds the operator (Figure 12.3), it prevents RNA polymerase from initiating transcription. The capability of the repressor to bind the operator and inhibit transcription depends on the conformation of the repressor, which is regulated allosterically by a key compound in the metabolic pathway, such as lactose or tryptophan, as described shortly. It is the concentration of this key metabolic substance that determines whether the operon is active or inactive at any given time.

The trp Operon

In a repressible operon, such as the tryptophan (or trp) operon, the repressor is unable to bind to the operator DNA by itself. Instead, the repressor is active as a DNA-binding protein only when complexed with a specific factor, such as tryptophan (Figure 12.3a), which functions as a corepressor. In the absence of tryptophan, the conformation of the repressor does not allow binding to the operator sequence, thus permitting RNA polymerase to bind to the promoter and transcribe the structural genes of the trp operon, leading to production of the enzymes that synthesize tryptophan. Once tryptophan becomes available, the enzymes of the tryptophan biosynthetic pathway are no longer required. Under these conditions, the increased concentration of tryptophan leads to the formation of the tryptophan–repressor complex, which binds to the operator and blocks transcription.

The lac Operon

The lac operon is a cluster of genes that regulates production of the enzymes needed to degrade lactose in bacterial cells. The lac operon is an example of an inducible operon, in which the presence of a key metabolic substance (in this case, lactose) induces transcription of the operon, allowing synthesis of the proteins encoded by the structural genes (Figure 12.3b). The lac operon contains three tandem structural genes: the z gene, which encodes β-galactosidase; the y gene, which encodes galactoside permease, a protein that promotes entry of lactose into the cell; and the a gene, which encodes thiogalactoside transacetylase, an enzyme whose physiologic role is unclear. If lactose is present in the medium, the disaccharide enters the cell by way of the few galactoside permease molecules already present and binds to the lac repressor, changing the conformation of the repressor and making it unable to attach to the DNA of the operator. This frees RNA polymerase to transcribe the operon, which is followed by translation of the three encoded proteins. Thus, in an inducible operon the repressor protein binds to the DNA only in the absence of lactose, which functions as the inducer.¹ As the concentration of lactose in the medium decreases, the disaccharide dissociates from its binding site on the repressor molecule,

¹ The actual inducer is allolactose, which is derived from and differs from lactose by the type of linkage joining the two sugars. This feature is ignored in the discussion.

[Figure 12.3 diagram. Panel (a), repressible operon: corepressor (e.g., tryptophan); inactive and active repressor; promoter (P), operator (O), and structural genes E, D, C, B, A; repressed and derepressed states; end product (tryptophan). Its biosynthesis halted, tryptophan's concentration falls as it is utilized. Panel (b), inducible operon: inducer (e.g., lactose); active and inactive repressor; promoter (P), operator (O), and structural genes z, y, a; induced and repressed states; catabolic pathway with substrate (lactose). Concentration of lactose falls as it is degraded.]

Figure 12.3 Gene regulation by operons. Inducible and repressible operons work on a similar principle: if the repressor is able to bind to the operator, genes are turned off; if the repressor is inactivated and unable to bind to the operator, genes are expressed. (a) In a repressible operon, such as the trp operon, the repressor, by itself, is unable to bind to the operator, and the structural genes encoding the enzymes are actively transcribed. The enzymes of the trp operon catalyze a series of reactions that result in the synthesis of the essential amino acid tryptophan. (1) When tryptophan is plentiful, tryptophan molecules act as corepressors by binding to the inactive repressor and (2) change its shape, allowing it to bind to the operator, (3) preventing transcription of the structural genes. Thus, when tryptophan concentration is high, the operon is repressed, preventing overproduction of tryptophan. (4) When the tryptophan concentration is low, the repressor molecules lack a corepressor and therefore fail to bind to the operator, allowing transcription (5) and translation (6) of enzymes (7) needed to synthesize tryptophan (8). (b) In an inducible operon, (1) the inducer (in this case, the disaccharide lactose) binds to the repressor protein and (2) prevents its binding to the operator (O). (3) Without the repressor in the way, RNA polymerase attaches to the promoter (P) and transcribes the structural genes. Thus, when the lactose concentration is high, the operon is induced, and the sugar-digesting enzymes encoded by the lac operon are synthesized (4). As the sugar is metabolized (5), its concentration dwindles, causing bound inducer molecules to dissociate from the repressor, which then regains its ability to attach to the operator and prevent transcription (6). Thus, when the inducer concentration is low, the operon is repressed, preventing synthesis of unneeded enzymes.

which allows the repressor to again bind to the operator and repress transcription.

Catabolite Repression

Repressors, such as those of the lac and trp operons, exert their influence by negative control, as the interaction of the DNA with this protein inhibits gene expression. The lac operon is also under positive control, as was discovered during an early investigation of a phenomenon called the glucose effect. If bacterial cells are supplied with glucose as well as a variety of other substrates, such as lactose or galactose, the cells catabolize the glucose first and ignore the other sugars. The glucose in the medium acts to repress the production of various catabolic enzymes, such as β-galactosidase, that would allow utilization of the other sugars. In 1965, a surprising finding was made: cyclic AMP (cAMP), previously thought to be involved only in eukaryotic metabolism, was detected in cells of E. coli. It was found that the concentration of cAMP in the cells was related to the presence of glucose in the medium; the higher the glucose concentration, the lower the cAMP concentration. Furthermore, when cAMP was added to the medium in the presence of glucose, the catabolic enzymes that were normally absent were suddenly synthesized by the cells.

Although the exact means by which glucose lowers the concentration of cAMP has still not been elucidated, the mechanism by which cAMP overcomes the effect of glucose is well understood. cAMP acts in prokaryotic cells by binding to a protein, the cAMP receptor protein (CRP). By itself, CRP is unable to bind to DNA. However, the cAMP-CRP complex recognizes and binds to a specific site in the lac control region (Figure 12.4). The presence of the bound cAMP-CRP causes a change in the conformation of the DNA, which makes it possible for RNA polymerase to transcribe the lac operon. The binding site in the promoter of the lac operon for RNA polymerase is not a high-affinity binding site, so initiation of transcription is extremely inefficient except in the presence of the cAMP-CRP complex. Even when lactose is present and the lac repressor is inactivated, RNA polymerase cannot transcribe the lac operon unless the levels of cAMP-CRP are high. Because of the inverse relationship between cAMP levels and glucose levels, transcription of the lac operon is thus regulated by glucose levels. As long as glucose is abundant, cAMP concentrations remain below the level required to promote transcription of the operon.

Attenuation

Because the trp repressor can only bind the operator in the presence of tryptophan, the concentration of tryptophan serves as a feedback regulator controlling transcription of the operon. A second form of feedback regulation controls the trp operon by regulating transcription termination, a mechanism referred to as attenuation. In the presence of high concentrations of tryptophan, RNA polymerase ceases transcription shortly after initiation in a region called the leader sequence. If the concentration of tryptophan is low, transcription does not terminate until the entire operon is transcribed. The mechanism of attenuation links alternative RNA secondary structures to transcription termination. Immediately after transcription, RNA from the leader region folds into one of two alternative secondary structures. One of these structures is a transcription termination signal that stops RNA polymerase from continuing to the end of the operon. The alternative structure does not contain a transcription termination signal and allows transcription of a single mRNA that encodes all the structural genes. Which of the two alternative RNA structures forms is determined by the concentration of tryptophan.
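The control logic described so far for the lac and trp operons can be summarized as a small truth table. The Python sketch below is a deliberate simplification: it treats each condition as all-or-none, whereas expression in a real cell is graded, and the function names are hypothetical labels chosen for this example.

```python
def lac_operon_transcribed(lactose: bool, glucose: bool) -> bool:
    """Return True when the lac structural genes are efficiently transcribed."""
    # Negative control: the lac repressor occupies the operator unless
    # the inducer (lactose in this simplified account) is bound to it.
    repressor_on_operator = not lactose
    # Positive control: cAMP is high only when glucose is low, and only
    # the cAMP-CRP complex can bind the lac control region.
    camp_crp_bound = not glucose
    return (not repressor_on_operator) and camp_crp_bound


def trp_operon_transcribed(tryptophan: bool) -> bool:
    """Return True when full-length trp mRNA is produced."""
    # Repression: the trp repressor binds the operator only when
    # complexed with its corepressor, tryptophan.
    repressor_on_operator = tryptophan
    # Attenuation: high tryptophan favors the leader RNA structure
    # that acts as a transcription termination signal.
    terminates_in_leader = tryptophan
    return (not repressor_on_operator) and (not terminates_in_leader)


def lac_truth_table():
    """List (lactose, glucose, transcribed) for all four conditions."""
    return [
        (lactose, glucose, lac_operon_transcribed(lactose, glucose))
        for lactose in (False, True)
        for glucose in (False, True)
    ]


if __name__ == "__main__":
    for lactose, glucose, active in lac_truth_table():
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> lac on: {active}")
    for tryptophan in (False, True):
        print(f"tryptophan={tryptophan!s:5} -> trp on: {trp_operon_transcribed(tryptophan)}")
```

Only one of the four lac conditions (lactose present, glucose absent) switches the operon on, which is the combined outcome of negative control by the repressor and positive control by cAMP-CRP.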

[Figure 12.4 diagram: the lac control region between the end of the I gene (final codons Glu, Ser, Gly, Gln, Stop) and the start of the Z gene (first codons fMet, Thr, Met), showing the cAMP-CRP binding site, the RNA polymerase binding site within the promoter, the operator, and the start of the mRNA. Positions are numbered from −100 to +40 relative to the site of initiation of transcription (+1).]

5′ GAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATG 3′
3′ CTTTCGCCCGTCACTCGCGTTGCGTTAATTACACTCAATCGAGTGAGTAATCCGTGGGGTCCGAAATGTGAAATACGAAGGCCGAGCATACAACACACCTTAACACTCGCCTATTGTTAAAGTGTGTCCTTTGTCGATACTGGTAC 5′

Figure 12.4 Nucleotide sequence of the control regions of the lac operon. The binding site for the sigma subunit of RNA polymerase (page 432) is a low-affinity binding site in the lac operon. Transcription of the operon is highly inefficient except in the presence of a complex between cAMP and the CRP protein. Because the concentration of cAMP is inversely proportional to the levels of glucose, the lac operon will not be activated unless glucose levels are low. The site of initiation of transcription is denoted as +1, which is approximately 40 nucleotides upstream from the site at which translation is initiated. Regions of sequence symmetry in the CRP site and operator are indicated by the red horizontal line. (FROM R. C. DICKSON ET AL., SCIENCE 187:32, 1975; COPYRIGHT 1975, REPRINTED WITH PERMISSION FROM AAAS.)

Riboswitches

Over the past few years a different type of mechanism has captured the interest of researchers studying bacterial gene regulation. Proteins such as the lac and trp repressors are not the only gene regulatory molecules that are influenced by interaction with small metabolites. A number of bacterial mRNAs have been identified that can bind a small metabolite, such as glucosamine or adenine, with remarkable specificity. The metabolite binds to a highly structured 5′ noncoding region of the mRNA. Once bound to the metabolite, these mRNAs, or riboswitches, as they are called, undergo a change in their folded conformation that allows them to alter the expression of a gene involved in production of that metabolite. Thus riboswitches act by means of a feedback mechanism similar to the alternative RNA structures that regulate attenuation in the trp operon. Most riboswitches suppress gene expression either by terminating transcription prematurely or by blocking initiation of translation. Like the repressors that function in conjunction with operons, riboswitches allow cells to adjust their level of gene expression in response to changes in the available levels of certain metabolites. Given that they act without the participation of protein cofactors, riboswitches are likely another legacy from an ancestral RNA world (page 454).
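The feedback behavior of a riboswitch can be illustrated with a toy simulation. The sketch below assumes, purely for illustration, that an mRNA with bound metabolite produces no enzyme, that enzyme output follows the mRNA instantly, and that the metabolite is consumed at a rate proportional to its concentration. The rate constants and the function name are invented for the example and do not describe any particular riboswitch.

```python
def simulate_metabolite(feedback: bool, steps: int = 5000, dt: float = 0.01) -> float:
    """Integrate the metabolite level with simple Euler steps.

    Returns the metabolite concentration (arbitrary units) after
    steps * dt time units.
    """
    k_max = 10.0     # assumed synthesis rate when no riboswitch is occupied
    k_use = 1.0      # assumed first-order consumption rate of the metabolite
    k_half = 2.0     # assumed concentration giving half-maximal occupancy
    metabolite = 0.0
    for _ in range(steps):
        if feedback:
            # Fraction of mRNAs whose 5' region has bound the metabolite
            occupied = metabolite / (k_half + metabolite)
        else:
            occupied = 0.0
        synthesis = k_max * (1.0 - occupied)   # occupied mRNAs are switched off
        metabolite += dt * (synthesis - k_use * metabolite)
    return metabolite


if __name__ == "__main__":
    print(f"without riboswitch feedback: {simulate_metabolite(False):.2f}")
    print(f"with riboswitch feedback:    {simulate_metabolite(True):.2f}")
```

With these assumed constants the metabolite settles at a markedly lower level when the riboswitch is active, showing how binding of the end product can hold its own production in check.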

REVIEW

1. Describe the cascade of events responsible for the sudden changes in gene expression in a bacterial cell following the addition of lactose. How does this compare with the events that occur in response to the addition of tryptophan?
2. What is the role of cyclic AMP in the synthesis of β-galactosidase?
3. What is a riboswitch?

12.2 | Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus

A primary difference between prokaryotic and eukaryotic organisms is the presence of a nucleus in eukaryotic cells. Moreover, most eukaryotic organisms have much larger genomes, which are present as double-stranded DNA molecules arranged in linear chromosomes that can be easily visualized during mitosis (as in Figure 12.22b). Before discussing regulation of gene expression in eukaryotes, we will first discuss the consequences of compartmentalizing genomes within nuclei and how large genomes are packaged into DNA–protein complexes called chromatin. The DNA in prokaryotic organisms is not packaged into chromatin; as a result, DNA-binding proteins such as RNA polymerase, repressors, and cAMP-CRP complexes can directly bind to their preferred binding sites. In contrast, DNA-binding proteins in eukaryotes have to find their preferred binding sites in the context of a complicated DNA–protein complex.

Considering its importance in the storage and utilization of genetic information, the nucleus of a eukaryotic cell has a rather undistinguished morphology (Figure 12.5). The contents of the nucleus are present as a viscous, amorphous mass of material enclosed by a complex nuclear envelope that forms a boundary between the nucleus and cytoplasm. Included within the nucleus of a typical interphase (i.e., nonmitotic) cell are (1) the chromosomes, which are present as highly extended nucleoprotein fibers, termed chromatin; (2) one or more nucleoli, which are irregularly shaped electron-dense structures that function in the synthesis of ribosomal RNA and the assembly of ribosomes (discussed on page 435); and (3) the nucleoplasm, the fluid substance in which the solutes of the nucleus are dissolved.

The Nuclear Envelope

The separation of a cell’s genetic material from the surrounding cytoplasm may be the single most important feature that distinguishes eukaryotes from prokaryotes, which makes the appearance of the nuclear envelope a landmark in biological evolution. The nuclear envelope consists of two cellular

Nuclear Envelope Nucleolus

(a)

Nuclear envelope

Nuclear pore

Nucleolus

Nucleoplasm Chromatin

(b)

Figure 12.5 The cell nucleus. (a) Electron micrograph of an interphase HeLa cell nucleus. Heterochromatin (page 498) is evident around the entire inner surface of the nuclear envelope. Two prominent nucleoli are visible, and clumps of chromatin can be seen scattered throughout the nucleoplasm. (b) Schematic drawing showing some of the major components of the nucleus. (A: FROM WERNER W. FRANKE, INT. REV. CYTOL. (SUPPL.) 4:130, 1974., WITH PERMISSION FROM ELSEVIER.)

489 Actin Filaments

Nesprin 1/2 Integral protein

Cytoplasm

Outer nuclear membrane Nesprin 3

Inner nuclear membrane

NM

Rough ER

HC (b)

Intermembrane Space Nuclear pore complex

Lamina

Heterochromatin

(a)

(b)

Figure 12.7 The nuclear lamina. (a) Nucleus of a cultured human cell that has been stained with fluorescently labeled antibodies against lamin A/C to reveal the nuclear lamina (red), which lies on the inner surface of the nuclear envelope. A protein that is proposed to be part of a nuclear matrix (or nuclear scaffold) appears green. (b) Electron micrograph of a freeze-dried, metal-shadowed nuclear envelope of a Xenopus oocyte that has been extracted with the nonionic detergent Triton X-100. The lamina appears as a rather continuous meshwork comprising filaments oriented roughly perpendicular to one another. Inset shows a well-preserved area from which nuclear pores have been mechanically removed. (c) These micrographs show the nucleus within a fibroblast that had been cultured from either a patient with HGPS

Figure 12.6 The nuclear envelope. (a) Schematic drawing showing the double membrane, nuclear pore complex, nuclear lamina, and the continuity of the outer membrane with the rough endoplasmic reticulum (ER). Both membranes of the nuclear envelope contain their own distinct complement of proteins. The actin filaments and intermediate filaments of the cytoskeleton are connected to the outer nuclear membrane by fibrous proteins (Nesprins). (b) Electron micrograph of a section through a portion of the nuclear envelope of an onion root tip cell. Note the double membrane (NM) with intervening space, nuclear pore complexes (NPC), and associated heterochromatin (HC) that does not extend into the region of the nuclear pores. (B: FROM WERNER W. FRANKE ET AL. J. CELL BIOL. 91:47S, 1981, FIG. 8. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

reticulum. The space between the membranes is continuous with the ER lumen (Figure 12.6a). The inner surface of the nuclear envelope of animal cells is bound by integral membrane proteins to a thin filamentous meshwork, called the nuclear lamina (Figure 12.7). The nuclear lamina provides mechanical support to the nuclear enve-

(c)

(bottom row) or a healthy subject (top row). The cells are stained for the protein lamin A (left column), for DNA (middle column), or shown in a living state under the phase-contrast light microscope (right column). The cell nucleus from the HGPS patient is misshapen due to the presence in the nuclear lamina of a truncated lamin A protein. (A: FROM H. MA, A. J. SIEGEL, AND R. BEREZNEY, J. CELL BIOL. 146:535, 1999, FIG. 2. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS; B: FROM U. AEBI, J. COHN, L. BUHLE, AND L. GERACE, NATURE 323:561, 1986. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED; C: FROM ANNA MATTOUT ET AL., COURTESY OF YOSEF GRUENBAUM, CURR. OPIN. CELL BIOL. 18:338, 2006, WITH PERMISSION FROM ELSEVIER.)

12.2 Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus

membranes arranged parallel to one another and separated by 10 to 50 nm (Figure 12.6). Together these membranes contain upwards of 60 distinct transmembrane proteins, including a number of species that link the outer nuclear membrane with elements of the cytoskeleton (Figure 12.6a). The inner and outer nuclear membranes are fused at sites forming circular pores that contain complex assemblies of proteins. The average mammalian cell contains several thousand nuclear pores. The outer membrane is generally studded with ribosomes and is continuous with the membrane of the rough endoplasmic

(a)

NPC

Ribosome

Intermediate Filaments

490

Chapter 12 Control of Gene Expression

lope, serves as a site of attachment for chromatin fibers at the nuclear periphery (Figure 12.6b), and has a poorly understood role in DNA replication and transcription. The filaments of the nuclear lamina are approximately 10 nm in diameter and composed of polypeptides, called lamins. Lamins are members of the same superfamily of polypeptides that assemble into the 10-nm intermediate filaments of the cytoplasm (see Table 9.2). The disassembly of the nuclear lamina prior to mitosis is induced by phosphorylation of the lamins. Mutations in one of the lamin genes (LMNA) are responsible for a number of diverse human diseases, including a rare form of muscular dystrophy (called EDMD2) in which muscle cells contain exceptionally fragile nuclei. Mutations in LMNA have also been linked to a disease, called Hutchinson-Gilford progeria syndrome (HGPS), that is characterized by premature aging and death during teenage years from heart attack or stroke. Figure 12.7c shows the misshapen nuclei from the cells of a patient with HGPS, demonstrating the importance of the nuclear lamina as a determinant of nuclear architecture. It is interesting to note that the phenotype depicted in Figure 12.7c has been traced to a synonymous mutation, that is, one that generated a different codon for the same amino acid. In this case, the change in DNA sequence alters the way the gene transcript is spliced, leading to production of a shortened protein, causing the altered phenotype. This example illustrates how the sequence of a gene serves as a “multiple code:” one that directs the translation machinery and others that direct the splicing machinery and protein folding. The Nuclear Pore Complex and Its Role in Nucleocytoplasmic Trafficking The nuclear envelope is the barrier between the nucleus and cytoplasm, and nuclear pores are the gateways across that barrier. 
Unlike the plasma membrane, which prevents passage of macromolecules between the cytoplasm and the extracellular space, the nuclear envelope is a hub of activity for the movement of RNAs and proteins in both directions between the nucleus and cytoplasm. The replication and transcription of genetic material within the nucleus require the participation of large numbers of proteins that are synthesized in the cytoplasm and transported across the nuclear envelope. Conversely, mRNAs, tRNAs, and ribosomal subunits that are manufactured in the nucleus must be transported through the nuclear envelope in the opposite direction. Some components, such as the snRNAs of the spliceosome (page 450), move in both directions; they are synthesized in the nucleus, assembled into RNP particles in the cytoplasm, and then shipped back to the nucleus where they function in mRNA processing. To appreciate the magnitude of the traffic between the two major cellular compartments, consider a HeLa cell, which is estimated to contain about 10,000,000 ribosomes. To support its growth, a single HeLa cell nucleus must import approximately 560,000 ribosomal proteins and export approximately 14,000 ribosomal subunits every minute. How do all of these materials pass through the nuclear envelope? In one early approach, a suspension of tiny gold particles was injected into cells and passage of the material through the nuclear envelope was observed with the electron microscope. As illustrated in Figure 12.8a,b, these particles

move from the cytoplasm into the nucleus by passing singlefile through the center of the nuclear pores. Electron micrographs of cells fixed in the normal course of their activities have also shown that particulate material can pass through a nuclear pore. An example is shown in Figure 12.8c, in which granular material presumed to consist of a ribosomal subunit is seen squeezing through one of these pores. Given the fact that materials as large as gold particles and ribosomal subunits can penetrate nuclear pores, one might assume that these pores are merely open channels, but just the opposite is true. Nuclear pores contain a doughnut-shaped structure called the nuclear pore complex (NPC) that straddles the nuclear envelope, projecting into both the cytoplasm and nucleoplasm. The NPC is a huge, supramolecular complex—15 to 30 times the mass of a ribosome—that exhibits


Figure 12.8 The movement of materials through the nuclear pore. (a) Electron micrograph of the nuclear–cytoplasmic border of a frog oocyte taken minutes after injection with gold particles that had been coated with a protein normally found in the nucleus. These particles pass through the center of the nuclear pores (arrows) on their way from the cytoplasm to the nucleus. (b) At higher magnification, the gold particles are seen to be clustered in a linear array within each pore. (c) Electron micrograph of a section through the nuclear envelope of an insect cell showing the movement of granular material (presumed to be a ribosomal subunit) through a nuclear pore. (A,B: COURTESY OF C. M. FELDHERR; C: FROM BARBARA J. STEVENS AND HEWSON SWIFT, J. CELL BIOL. 31:72, 1966, FIG. 23. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)
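The HeLa traffic figures quoted in the text can be checked against one another with a few lines of arithmetic. The sketch below uses only the numbers given in the passage plus one outside assumption, labeled in the code: that each ribosome is assembled from two subunits. The variable names are illustrative.

```python
# Figures quoted in the text for a single HeLa cell.
RIBOSOMES_PER_CELL = 10_000_000
PROTEINS_IMPORTED_PER_MIN = 560_000
SUBUNITS_EXPORTED_PER_MIN = 14_000

# Assumption (not stated in this passage): one ribosome = two subunits,
# one large and one small.
SUBUNITS_PER_RIBOSOME = 2

ribosomes_per_min = SUBUNITS_EXPORTED_PER_MIN / SUBUNITS_PER_RIBOSOME
proteins_per_ribosome = PROTEINS_IMPORTED_PER_MIN / ribosomes_per_min
hours_to_double = RIBOSOMES_PER_CELL / ribosomes_per_min / 60

print(f"ribosomes' worth of subunits exported per minute: {ribosomes_per_min:.0f}")
print(f"ribosomal proteins imported per ribosome: {proteins_per_ribosome:.0f}")
print(f"hours needed to build a full set of ribosomes: {hours_to_double:.1f}")
```

On these numbers the cell exports enough subunits for about 7,000 ribosomes per minute, imports about 80 ribosomal proteins for each of them, and would need roughly a day to duplicate its full complement of ribosomes.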


octagonal symmetry due to the eightfold repetition of a number of structures (Figure 12.10). Despite their considerable size and complexity, NPCs contain only about 30 different proteins, called nucleoporins, which are largely conserved between yeast and vertebrates. Each nucleoporin is present in multiple copies (8, 16, or 32 in number), in keeping with the octagonal symmetry of the structure. The structure of the NPC is seen in the electron micrographs of Figure 12.9 and the models of Figure 12.10. At the heart of the NPC is a central channel, which is surrounded by a ring of nucleoporins whose rearrangements can change the diameter of the opening from about 20 to 40 nm. The NPC is not a static structure, as evidenced by the finding that many of its component proteins are replaced with new copies over a time

[Labels in Figures 12.9 and 12.10: cytoplasm; cytoplasmic filaments; cytoplasmic ring; FG-repeat domains of FG nucleoporins; outer nuclear membrane (ONM); inner nuclear membrane (INM); nuclear envelope (NE); NE membrane protein; central channel; central scaffold; transmembrane nucleoporin ring; nuclear ring; nuclear basket; nucleoplasm; NEL]

Figure 12.9 Scanning electron micrographs of the nuclear pore complex from isolated nuclear envelopes of an amphibian oocyte. (a) The cytoplasmic face of the nuclear envelope showing the peripheral cytoplasmic ring of the nuclear pore complex. (b) The nuclear face of the nuclear envelope showing the basket-like appearance of the inner portion of the complex. (c) The nuclear face of the envelope showing the distribution of the NPCs and places where intact patches of the nuclear lamina (NEL) are retained. In all of these micrographs, isolated nuclear envelopes were fixed, dehydrated, dried, and metal-coated. (FROM M. W. GOLDBERG AND T. D. ALLEN, J. CELL BIOL. 119:1431, 1992, FIGURES 1–3. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

Figure 12.10 A model of a vertebrate nuclear pore complex (NPC). (a) Schematic representation of a vertebrate NPC as it is situated within the nuclear envelope. This elaborate structure consists of several parts, including a scaffold and transmembrane ring that anchors the complex to the nuclear envelope, a cytoplasmic and a nuclear ring, a nuclear basket, and eight cytoplasmic filaments. The FG-containing nucleoporins line a central channel with their disordered FG-containing domains extending into the opening and forming a hydrophobic meshwork. (b) Three-dimensional reconstruction of a portion of a nuclear pore complex showing the localization of individual nucleoporin molecules within the structure. The FG nucleoporins are shown in green. Several transmembrane nucleoporins span the pore membrane, forming an outer ring that anchors the NPC to the nuclear envelope. (B: FROM JAVIER FERNANDEZ-MARTINEZ AND MICHAEL P. ROUT, CURR. OPIN. CELL BIOL. 21:604, 2009, WITH PERMISSION FROM ELSEVIER.)

12.2 Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus


Chapter 12 Control of Gene Expression


period of seconds to minutes. Recent studies suggest that this dynamic exchange of nucleoporins may play a role in activating the transcription of chromatin that is associated with the NPC.

Among the nucleoporins is a subset of proteins that possess, within their amino acid sequence, a large number of phenylalanine-glycine repeats (FG, by their single-letter names). The FG repeats are clustered in a particular region of each molecule called the FG domain. Because of their unusual amino acid composition, the FG domains possess a disordered structure (page 57) that gives them an extended and flexible organization. The FG repeat-containing nucleoporins are thought to line the central channel of the NPC with their filamentous FG domains extending into the heart of the central channel. The FG domains form a hydrophobic meshwork or sieve that blocks the free diffusion of larger macromolecules (greater than about 40,000 daltons) between the nucleus and cytoplasm.

In 1982, Robert Laskey and his co-workers at the Medical Research Council of England found that nucleoplasmin, one of the more abundant nuclear proteins of amphibian oocytes, contains a stretch of amino acids near its C-terminus that functions as a nuclear localization signal (NLS). This sequence enables a protein to pass through the nuclear pores and enter the nucleus. The best-studied, or “classical,” NLSs consist of one or two short stretches of positively charged amino acids. The T antigen encoded by the virus SV40, for example, contains an NLS identified as -Pro-Lys-Lys-Lys-Arg-Lys-Val-. If one of the basic amino acids in this sequence is replaced by a nonpolar amino acid, the protein fails to localize to the nucleus. Conversely, if this NLS is fused to a nonnuclear protein, such as serum albumin, and injected into the cytoplasm, the modified protein becomes concentrated in the nucleus.
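The logic of the SV40 experiments (a short basic stretch is needed for nuclear localization, and it is sufficient when fused to another protein) can be mimicked with a toy sequence check. The rule used here, an unbroken run of five lysine or arginine residues, is an assumption chosen only so that it matches the SV40 example; it is not a real NLS predictor, and the mutant and cargo sequences are hypothetical.

```python
BASIC = set("KR")  # lysine (K) and arginine (R), the positively charged residues


def longest_basic_run(seq):
    """Length of the longest unbroken stretch of Lys/Arg in a one-letter sequence."""
    best = current = 0
    for residue in seq:
        current = current + 1 if residue in BASIC else 0
        best = max(best, current)
    return best


def looks_like_classical_nls(seq, min_run=5):
    # min_run=5 is a hypothetical threshold tuned to the SV40 signal.
    return longest_basic_run(seq) >= min_run


SV40_NLS = "PKKKRKV"               # Pro-Lys-Lys-Lys-Arg-Lys-Val, from the text
MUTANT_NLS = "PKAKRKV"             # hypothetical: one Lys replaced by nonpolar Ala
HYPOTHETICAL_CARGO = "MAGSTLEDAV"  # made-up stand-in for a nonnuclear protein

for name, seq in [("SV40 NLS", SV40_NLS),
                  ("mutant NLS", MUTANT_NLS),
                  ("cargo alone", HYPOTHETICAL_CARGO),
                  ("cargo + NLS", HYPOTHETICAL_CARGO + SV40_NLS)]:
    print(f"{name}: nuclear signal detected = {looks_like_classical_nls(seq)}")
```

The wild-type signal and the fusion pass the check, whereas the single substitution and the untagged cargo do not, which parallels the loss-of-function and gain-of-function results described above.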
Thus, targeting of proteins to the nucleus is similar in principle to trafficking of other proteins that are destined for segregation within a particular organelle, such as a mitochondrion or a peroxisome (page 316). In all of these cases, the proteins possess a specific “address” that is recognized by a specific receptor that mediates its transport into the organelle.

The study of nuclear transport has been a very active area of research, driven by the development of in vitro systems capable of selectively importing proteins and RNPs into the nucleus. Using these systems, researchers have identified a family of proteins that function as mobile transport receptors, ferrying macromolecules across the nuclear envelope. Within this family, importins move macromolecules from the cytoplasm into the nucleus and exportins move macromolecules in the opposite direction.

Figure 12.11a depicts some of the major steps that occur during the nuclear import of a protein, such as nucleoplasmin, that contains a classical NLS. Import begins as the NLS-containing cargo protein binds to a heterodimeric, soluble NLS receptor, called importin α/β, that resides in the cytoplasm (step 1, Figure 12.11a). The transport receptor is thought to escort the protein cargo to the outer surface of the nucleus where it likely docks with the cytoplasmic filaments that extend from the outer ring of the NPC (step 2). Figure 12.11b shows a number of gold particles bound to these filaments; these particles were coated with an NLS-containing nuclear protein that was being transported through the nuclear pore complex. The receptor–cargo complex then moves through the nuclear pore (step 3, Figure 12.11a) by engaging in a series of successive interactions with the FG domains of the FG-containing nucleoporins, allowing passage of the receptor–cargo complex through the NPC.

Once the bound cargo proceeds through the NPC, a GTP-binding protein called Ran drives release of the transported protein into the nuclear compartment. Like other GTP-binding proteins, such as Sar1 (page 296) and EF-Tu (page 471) discussed in earlier chapters, Ran can exist in an active GTP-bound form or an inactive GDP-bound form. Ran’s role in regulating nucleocytoplasmic transport is based on a mechanism in which the cell maintains a high concentration of Ran-GTP in the nucleus and a very low concentration of Ran-GTP in the cytoplasm. The steep gradient of Ran-GTP across the nuclear envelope depends on the compartmentalization of certain accessory proteins (see Figure 15.21b for further discussion). One of these accessory proteins (named RCC1) is sequestered in the nucleus where it promotes the conversion of Ran-GDP to Ran-GTP, thus maintaining the high nuclear level of Ran-GTP. Another accessory protein (named RanGAP1) resides in the cytoplasm where it promotes the hydrolysis of Ran-GTP to Ran-GDP, thus maintaining the low cytoplasmic level of Ran-GTP. Thus the energy released by GTP hydrolysis is used to maintain the Ran-GTP gradient across the nuclear envelope. As discussed below, the Ran-GTP gradient drives nuclear transport by a process that depends only on receptor-mediated diffusion; no motor proteins or ATPases have been implicated.

We can now return to our description of the classical NLS import pathway. When the importin–cargo complex arrives in the nucleus, it is met by a molecule of Ran-GTP, which binds to the complex and causes its disassembly as indicated in step 4, Figure 12.11a.
This is the apparent function of the high level of Ran-GTP in the nucleus: it promotes the disassembly of complexes imported from the cytoplasm. The imported cargo is released into the nucleoplasm, and one portion of the NLS receptor (the importin β subunit) is shuttled back to the cytoplasm together with the bound Ran-GTP (step 5). Once in the cytoplasm, the GTP molecule bound to Ran is hydrolyzed, releasing Ran-GDP from the importin β subunit. Ran-GDP is returned to the nucleus, where it is converted back to the GTP-bound state for additional rounds of activity. Importin α is transported back to the cytoplasm by one of the exportins.

Ran-GTP plays a key role in the escort of macromolecules out of the nucleus, just as it does in their import from the cytoplasm. Recall that Ran-GTP is essentially confined to the nucleus. Whereas Ran-GTP induces the disassembly of imported complexes, as shown in step 4 of Figure 12.11a, it promotes the assembly of export complexes. Proteins exported from the nucleus contain amino acid sequences (called nuclear export signals, or NESs) that are recognized by transport receptors that carry them through the nuclear envelope to the cytoplasm.

RNA Transport

Most of the traffic out of the nucleus consists of various types of RNA molecules: mRNAs, rRNAs,


[Labels in Figure 12.11. (a): nucleoplasm; cytoplasm; Ran-GTP; Ran-GDP; importin α; importin β; exportin; NLS protein; steps 1–5. (b): N; C; CF]

snoRNAs, miRNAs, and tRNAs, which are synthesized in the nucleus and function in the cytoplasm or are modified in the cytoplasm and return to function in the nucleus. These RNAs move through the NPC as ribonucleoproteins (RNPs). As with protein transport, RNA transport involves the association of transport receptors that ferry mRNP complexes through nuclear pores. Transport of an mRNP from the nucleus to cytoplasm is associated with extensive remodeling; certain proteins are stripped from the mRNA, while others are added to the complex. Numerous studies have demonstrated a functional link between pre-mRNA splicing and mRNA export; only mature (i.e., fully processed) mRNAs are capable of nuclear export. If an mRNA still contains an unspliced intron, that RNA is retained in the nucleus.

Chromosomes and Chromatin

Chromosomes seem to appear out of nowhere at the beginning of mitosis and disappear once again when cell division has ended. The appearance and disappearance of chromosomes provided early cytologists with a challenging question: What is the nature of the chromosome in the nonmitotic cell?


Figure 12.11 Importing proteins from the cytoplasm into the nucleus. (a) Proposed steps in nuclear protein import. Proteins bearing a nuclear localization signal (NLS) bind to the heterodimeric receptor (importin α/β) (step 1) forming a complex that associates with a cytoplasmic filament (step 2). The receptor-cargo complex moves through the nuclear pore (step 3) and into the nucleoplasm where it interacts with Ran-GTP and dissociates (step 4). The importin β subunit, in association with Ran-GTP, is transported back to the cytoplasm, where the Ran-GTP is hydrolyzed (step 5). Ran-GDP is subsequently transported back to the nucleus, where it is converted to Ran-GTP. Conversely, importin α is transported back to the cytoplasm. (b) Nucleoplasmin is a protein present in high concentration in the nucleoplasm of Xenopus oocytes. When gold particles are coated with nucleoplasmin and injected into the cytoplasm of a Xenopus oocyte, they are seen to bind to the cytoplasmic filaments (CF) projecting from the outer ring of the nuclear pore complex. Several particles are also seen in transit through the pore (NP) into the nucleus. (A: BASED ON A MODEL BY M. OHNO ET AL., CELL 92:327, 1998; CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER. B: FROM W. D. RICHARDSON ET AL., CELL 52:662, 1988; WITH PERMISSION FROM ELSEVIER. COURTESY OF A. D. MILLS.)
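The five steps of Figure 12.11a amount to a bookkeeping exercise: which component is in which compartment, and which nucleotide is bound to Ran. The following is a minimal sketch of that bookkeeping, not a kinetic model. The class and function names are hypothetical, and the code follows a single Ran molecule and a single cargo through one round of import.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Locations of the players in one round of classical NLS import."""
    cargo: str = "cytoplasm"
    importin_alpha: str = "cytoplasm"
    importin_beta: str = "cytoplasm"
    ran: str = "nucleus"            # the single Ran molecule being followed
    ran_nucleotide: str = "GTP"     # kept as GTP in the nucleus by RCC1
    log: list = field(default_factory=list)


def import_cycle(cell: Cell) -> Cell:
    # Steps 1 and 2: cargo binds the receptor and docks at the pore.
    cell.log.append("1-2: NLS cargo binds importin alpha/beta and docks at the pore")

    # Step 3: the receptor-cargo complex moves through the NPC.
    cell.cargo = cell.importin_alpha = cell.importin_beta = "nucleus"
    cell.log.append("3: receptor-cargo complex passes through the NPC")

    # Step 4: nuclear Ran-GTP binds and the complex falls apart.
    if (cell.ran, cell.ran_nucleotide) != ("nucleus", "GTP"):
        raise RuntimeError("no nuclear Ran-GTP: cargo cannot be released")
    cell.log.append("4: Ran-GTP binds, complex disassembles, cargo is released")

    # Step 5: importin beta leaves with Ran-GTP; RanGAP1 in the cytoplasm
    # promotes hydrolysis to Ran-GDP.
    cell.importin_beta = cell.ran = "cytoplasm"
    cell.ran_nucleotide = "GDP"
    cell.log.append("5: importin beta exits with Ran-GTP; GTP is hydrolyzed")

    # Recycling: importin alpha is exported; Ran-GDP returns to the nucleus
    # and RCC1 promotes its conversion back to Ran-GTP.
    cell.importin_alpha = "cytoplasm"
    cell.ran, cell.ran_nucleotide = "nucleus", "GTP"
    cell.log.append("recycling: importin alpha exported; Ran-GDP re-imported and recharged")
    return cell


cell = import_cycle(Cell())
for line in cell.log:
    print(line)
```

After one pass, the cargo is in the nucleus while both importin subunits and Ran are back where they started, so the cycle can repeat. The only net chemical change in this sketch is the hydrolysis of GTP.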


Packaging the Genome

An average human cell contains about 6.4 billion base pairs of DNA divided among 46 chromosomes (the value for a diploid number of chromosomes). Each chromosome contains a single, continuous DNA molecule; the larger the chromosome, the more DNA it contains. The largest human chromosome contains ~0.3 billion base pairs. Given that each base pair is 0.34 nm in length, 0.3 billion base pairs would constitute a DNA molecule, from just one chromosome, more than 10 cm long. How is it possible to fit all 46 chromosomes into a nucleus that is only 10 μm (1 × 10⁻⁵ m) in diameter and, at the same time, maintain the DNA in a state that is accessible to enzymes and regulatory proteins? Just as important, how is the single DNA molecule of each chromosome organized so that it does not become hopelessly tangled with other chromosomes? The answers lie in the remarkable manner in which a DNA molecule is packaged in eukaryotic cells.

Nucleosomes: The Lowest Level of Chromosome Organization

Chromosomes are composed of DNA and associated proteins, together called chromatin. The orderly packaging of eukaryotic DNA depends on histones, a remarkable group of small proteins that possess an unusually high content of the basic amino acids arginine and lysine. Histones are divided into five classes, which can be distinguished by their arginine/lysine ratio (Table 12.1). The amino acid sequences of histones, particularly H3 and H4, have undergone very little change over long periods of evolutionary time. The H4 histones of both peas and cows, for example, contain 102 amino acids, and their sequences differ at only 2 amino acid residues. Why are histones so highly conserved? One reason is that the positively charged histones interact with the negatively charged backbone of the DNA molecule, which is identical in all organisms. In addition, nearly all of the amino acids in a histone molecule are engaged in an interaction with another molecule, either DNA or another histone.
Table 12.1 Calf Thymus Histones

Histone   Number of residues   Mass (kDa)   %Arg   %Lys   UEP* (× 10⁻⁶ year)
H1        215                  23.0          1     29       8
H2A       129                  14.0          9     11      60
H2B       125                  13.8          6     16      60
H3        135                  15.3         13     10     330
H4        102                  11.3         14     11     600

*Unit evolutionary period: the time for a protein’s amino acid sequence to change by 1 percent after two species have diverged.

As a result, very few amino acids in a histone can be replaced with other amino acids without severely affecting the function of the protein.

In the early 1970s, it was found that when chromatin was treated with nonspecific nucleases, most of the DNA was converted to fragments of approximately 200 base pairs in length. In contrast, a similar treatment of naked DNA (i.e., DNA devoid of proteins) produced a randomly sized population of fragments. This finding suggested that chromosomal DNA was protected from enzymatic attack, except at certain periodic sites along its length. It was presumed that the proteins associated with the DNA were providing the protection. In 1974, using data from nuclease digestion and other types of information, Roger Kornberg proposed an entirely new structure for chromatin. Kornberg proposed that DNA and histones are organized into repeating subunits, called nucleosomes. We now know that each nucleosome contains a nucleosome core particle consisting of 146 base pairs of supercoiled DNA (page 397) wrapped almost twice around a disk-shaped complex of eight histone molecules (Figure 12.12a). The histone core of each nucleosome consists of two copies each of histones H2A, H2B, H3, and H4 assembled into an octamer (Figure 12.13a). The remaining histone, type H1,


Figure 12.12 Nucleosomal organization of chromatin. (a) Schematic diagram showing the structure of a nucleosome core particle and an associated histone H1 molecule. The core particle itself consists of approximately 1.8 turns (146 base pairs) of negatively supercoiled DNA wrapped around eight core histone molecules (two each of H2A, H2B, H3, and H4). The H1 linker histone binds near the sites where DNA enters and exits the nucleosome. Two alternate positions of the H1 molecule are shown. (b) Electron micrograph of chromatin fibers released from the nucleus of a Drosophila cell. The nucleosome core particles are approximately 10 nm in diameter and are connected by short strands of naked linker DNA, which are approximately 2 nm in diameter. (B: COURTESY OF OSCAR L. MILLER, UNIVERSITY OF VIRGINIA.)
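The nuclease-digestion result described in the text (fragments of about 200 base pairs from chromatin, random sizes from naked DNA) follows directly from periodic protection, as a small simulation can illustrate. This is a toy sketch under stated assumptions: a strictly regular 200-bp repeat whose first 146 bp are fully protected, at most one cut per linker, and arbitrary values for the cutting probability and the number of cuts. The function names are hypothetical.

```python
import random

REPEAT = 200   # approximate spacing of cut sites (bp), from the text
CORE = 146     # bp wrapped around the histone core, from the text


def fragments(length, cuts):
    """Fragment lengths produced by cutting a linear molecule at sorted positions."""
    edges = [0] + cuts + [length]
    return [b - a for a, b in zip(edges, edges[1:])]


def digest_chromatin(n_nucleosomes, cut_prob, rng):
    """Cut only within linker DNA, at most once per linker (toy assumption)."""
    cuts = []
    for i in range(n_nucleosomes - 1):
        if rng.random() < cut_prob:
            linker_start = i * REPEAT + CORE
            cuts.append(rng.randrange(linker_start, (i + 1) * REPEAT))
    return fragments(n_nucleosomes * REPEAT, cuts)


def digest_naked(length, n_cuts, rng):
    """Cut unprotected DNA at n_cuts positions chosen anywhere along its length."""
    return fragments(length, sorted(rng.sample(range(1, length), n_cuts)))


def off_ladder(fragment):
    """Distance (bp) from the nearest multiple of the repeat length."""
    remainder = fragment % REPEAT
    return min(remainder, REPEAT - remainder)


rng = random.Random(0)
chromatin = digest_chromatin(100, 0.8, rng)
naked = digest_naked(100 * REPEAT, 80, rng)

print("chromatin, largest deviation from a 200-bp ladder:",
      max(off_ladder(f) for f in chromatin))
print("naked DNA, largest deviation from a 200-bp ladder:",
      max(off_ladder(f) for f in naked))
```

In the chromatin digest every fragment lies within one linker length (54 bp in this layout) of a multiple of 200 bp, whereas the naked-DNA fragments have no such constraint. A more realistic model would let the repeat length and the degree of protection vary.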


resides outside the nucleosome core particle. The H1 histone is referred to as a linker histone because it binds to part of the linker DNA that connects one nucleosome core particle to the next. Fluorescence studies indicate that H1 molecules continuously dissociate and reassociate with chromatin. Together the H1 protein and the histone octamer interact with about 168 base pairs of DNA. H1 histone molecules can be selectively removed from the chromatin fibers by subjecting the preparation to solutions of low ionic strength. When H1-depleted chromatin is observed under the electron microscope, the nucleosome core particles and naked linker DNA can be seen as separate elements, which together appear like “beads on a string” (Figure 12.12b).

Our understanding of DNA packaging has been greatly advanced by portraits of the nucleosome core particle obtained by X-ray crystallography. The eight histone molecules that make up a nucleosome core particle are organized into four heterodimers: two H2A-H2B dimers and two H3-H4 dimers (Figure 12.13a,b). Dimerization of histone molecules is mediated by their C-terminal domains, which consist largely of α helices (represented by the cylinders in Figure 12.13c) folded into a compact mass in the core of the nucleosome. In contrast, the N-terminal segments from each core histone (and also the C-terminal segment of H2A) take the form of a long, flexible tail (represented by the dashed lines of Figure 12.13c) that extends out through the DNA helix that is wrapped around the core particle. For many years, histones were thought of as inert, structural molecules, but the extending N-terminal segments are targets of key enzymes that play a role in making the chromatin accessible to proteins. In this way, chromatin is a dynamic cellular component in which histones, regulatory proteins, and a variety of enzymes move in and out of the nucleoprotein complex to facilitate the complex tasks of DNA transcription, compaction, replication, recombination, and repair.

Histone modification is not the only mechanism that alters the histone character of nucleosomes. In addition to the four “conventional” core histones discussed above, several alternate versions of the H2A and H3 histones are also synthesized in most cells. The importance of these histone variants, as they are called, remains largely unexplored, but they are thought to have specialized functions. The localization and apparent function of one of these variants, CENP-A, is discussed on page 509. Another variant, H2A.X, is distributed throughout the chromatin, where it replaces conventional H2A in a fraction of the nucleosomes. H2A.X becomes phosphorylated at sites of DNA-strand breakage and may play a


Figure 12.13 The structure of a nucleosome. (a) Schematic representation of a nucleosome core particle with its histone octamer composed of four histone heterodimers (two H3/H4 dimers and two H2A/H2B dimers). (b) X-ray crystallographic structure of a nucleosome core particle viewed down the central axis of the DNA superhelix, showing the position of each of the eight histone molecules of the core octamer. The histones are organized into four dimeric complexes. Each histone dimer binds 27 to 28 base pairs of DNA, with contacts occurring where the minor groove of the DNA faces the histone core. (c) A simplified, schematic model of half of a nucleosome core particle, showing one turn (73 base pairs) of the DNA superhelix and four core histone molecules. The four different histones (H2A, H2B, H3, and H4) are shown in separate colors, as indicated by the key. Each core histone is seen to consist of (1) a globular region, called the “histone fold,” consisting of three α helices (represented by the cylinders) and (2) a flexible, extended N-terminal tail (indicated by the letter N) that projects out of the histone disk and out past the DNA double helix. The intermittent points of interaction between the histone molecules and the DNA are indicated by white hooks. The dashed lines indicate the outermost portion of the histone tails, which are sites of modification. These flexible tails lack a defined tertiary structure and therefore do not appear in the X-ray structure shown in part b. (A: FROM C. DAVID ALLIS, ET AL., EPIGENETICS, FIGURE 3.5, P. 30, COPYRIGHT 2007. REPRINTED WITH PERMISSION FROM COLD SPRING HARBOR LABORATORY PRESS; B: FROM KAROLIN LUGER AND TIMOTHY J. RICHMOND, ET AL., NATURE 389:251, 1997, FIG. 1. REPRINTED BY PERMISSION OF MACMILLAN PUBLISHERS LIMITED; C: AFTER DRAWING BY D. RHODES; © 1997, BY MACMILLAN PUBLISHERS LIMITED.)


role in recruiting the enzymes that repair the DNA. Two other core histone variants, H2A.Z and H3.3, have been implicated in numerous activities, including the activation of transcription, but their roles are debated.

We began this section by wondering how a nucleus 10 μm in diameter can pack 200,000 times this length of DNA within its boundaries. The assembly of nucleosomes is the first important step in the compaction process. With a nucleotide–nucleotide spacing of 0.34 nm, the 200 base pairs of a single 10-nm nucleosome would stretch nearly 70 nm if fully extended. Consequently, it is said that the packing ratio of the DNA of nucleosomes is approximately 7:1.

Higher Levels of Chromatin Structure

A DNA molecule wrapped around nucleosome core particles of 10-nm diameter is the lowest level of chromatin organization. Chromatin does not, however, exist within the cell in this relatively extended, “beads-on-a-string” state. When chromatin is released from nuclei and prepared at physiologic ionic strength, a fiber of approximately 30-nm thickness is observed (Figure 12.14a). Despite more than two decades of investigation, the structure of the 30-nm fiber remains a subject of debate. Two models in which the nucleosomal filament is coiled into the higher-order, thicker fiber are shown in Figure 12.14b,c. The models differ in the relative positioning of nucleosomes within the fiber. Recent research favors the “zig-zag” model depicted in Figure 12.14b, in which successive nucleosomes along the DNA are arranged in different stacks and alternating nucleosomes become interacting neighbors. Regardless of how it is accomplished, the assembly of the 30-nm fiber increases the DNA-packing ratio an additional 6-fold, or about 40-fold altogether.

Maintenance of the 30-nm fiber depends on interactions between histone molecules of neighboring nucleosomes. Linker histones and core histones have both been implicated in higher-order packaging of chromatin. If, for example, H1 linker histones are selectively extracted from compacted chromatin, the 30-nm fibers uncoil to form the thinner, more extended beaded filament shown in Figure 12.12b. Adding back H1 histone leads to restoration of the higher-order structure. Core histones of adjacent nucleosomes may interact with one another by means of their long, flexible tails. Structural studies indicate, for example, that the N-terminal tail of an H4 histone from one nucleosome core particle can reach out and make extensive contact with both the linker DNA between nucleosome particles and the H2A/H2B dimer of adjacent particles. These types of interactions are thought to mediate the folding of the nucleosomal filament into a thicker fiber. In fact, chromatin fibers prepared with H4 histones that lack their tails are unable to fold into higher-order fibers.
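The lengths and packing ratios quoted in this section can be reproduced from the numbers given in the text. The sketch below is plain arithmetic; the constant names are illustrative, and each packing ratio is taken simply as extended DNA length divided by the length of the packaged structure.

```python
BP_LENGTH_NM = 0.34               # length of one base pair
GENOME_BP = 6.4e9                 # diploid human cell
LARGEST_CHROMOSOME_BP = 0.3e9
NUCLEUS_DIAMETER_NM = 10_000      # 10 micrometers
NUCLEOSOME_BP = 200               # DNA per nucleosome repeat
NUCLEOSOME_DIAMETER_NM = 10
FIBER_EXTRA_FOLD = 6              # additional compaction in the 30-nm fiber

largest_chromosome_cm = LARGEST_CHROMOSOME_BP * BP_LENGTH_NM * 1e-7  # nm to cm
genome_m = GENOME_BP * BP_LENGTH_NM * 1e-9                           # nm to m
length_vs_nucleus = GENOME_BP * BP_LENGTH_NM / NUCLEUS_DIAMETER_NM
nucleosome_ratio = NUCLEOSOME_BP * BP_LENGTH_NM / NUCLEOSOME_DIAMETER_NM
fiber_ratio = nucleosome_ratio * FIBER_EXTRA_FOLD

print(f"DNA of the largest chromosome: {largest_chromosome_cm:.1f} cm")
print(f"total DNA per diploid nucleus: {genome_m:.2f} m")
print(f"total DNA length / nuclear diameter: {length_vs_nucleus:,.0f}")
print(f"nucleosome packing ratio: {nucleosome_ratio:.1f}")
print(f"30-nm fiber packing ratio: {fiber_ratio:.1f}")
```

The results (about 10 cm, about 2.2 m, roughly 200,000-fold, about 7:1, and about 40:1) match the rounded figures used in the text.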

[Labels in Figure 12.14: zig-zag; solenoid; 11 nm; 30-nm fiber]

Figure 12.14 The 30-nm fiber. (a) Electron micrograph of a 30-nm chromatin fiber released from a nucleus following lysis of the cell in a hypotonic salt solution. (b) In the “zig-zag” model, the linker DNA is present in a straight, extended state that criss-crosses back and forth between consecutive core particles, which are organized into two separate stacks of nucleosomes. The lower portion of the figure shows how the two stacks of nucleosomes are coiled into a higher-order helical structure. (c) In the “solenoid” model, the linker DNA gently curves as it connects consecutive core particles, which are organized into a single, continuous helical array containing about 6–8 nucleosomes per turn. In these models, the histone octamer is shown in orange, the DNA in blue, and the linker H1 histone in yellow. (A: COURTESY OF BARBARA HAMKALO AND JEROME B. RATTNER; B,C: FROM SEPIDEH KHORASANIZADEH, CELL 116:262, FIGURE 3A, 3B, 2004, WITH PERMISSION FROM ELSEVIER.)


The next stage in the hierarchy of DNA packaging is thought to occur as the 30-nm chromatin fiber is gathered into a series of large, supercoiled loops, or domains, that may be compacted into even thicker (80–100 nm) fibers. The DNA loops are apparently tethered at their bases to proteins that may be part of a poorly defined nuclear scaffold. Normally, loops of chromatin fibers are spread out within the nucleus and cannot be visualized, but their presence can be revealed under certain circumstances. For instance, when isolated mitotic chromosomes are treated with solutions that extract histones, the histone-free DNA can be seen to extend outward as loops from a protein scaffold (Figure 12.15a). Among the proteins thought to play a key role in maintaining these DNA loops is cohesin, which is best known for its role in holding replicated DNA molecules together during mitosis (Section 14.2). Cohesin is a ring-shaped protein that may act as shown in Figure 12.15b.

The mitotic chromosome represents the ultimate in chromatin compactness; 1 μm of mitotic chromosome length typically contains approximately 1 cm of DNA, which represents a packing ratio of 10,000:1. This compaction occurs by a poorly understood process that is discussed in Section 14.2. An overview of the various levels of chromatin organization, from the nucleosomal filament to a mitotic chromosome, is depicted in Figure 12.16.
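The mitotic packing ratio can be put in context with a short calculation. The sketch below converts the figure of 1 cm of DNA per 1 μm of chromosome into a ratio, compares it with the roughly 40-fold compaction of the 30-nm fiber, and estimates the mitotic length of the largest human chromosome. That last estimate is an illustration derived from the quoted values, not a measurement reported in the text.

```python
UM_PER_CM = 10_000
DNA_CM_PER_UM_OF_CHROMOSOME = 1.0   # about 1 cm of DNA per 1 um of chromosome
FIBER_RATIO = 40                    # approximate packing ratio of the 30-nm fiber

# Both lengths expressed in micrometers, so the ratio is dimensionless.
mitotic_ratio = DNA_CM_PER_UM_OF_CHROMOSOME * UM_PER_CM
extra_beyond_fiber = mitotic_ratio / FIBER_RATIO

# Largest human chromosome: about 0.3 billion bp at 0.34 nm per bp.
largest_dna_um = 0.3e9 * 0.34 / 1000            # nm to um
largest_mitotic_um = largest_dna_um / mitotic_ratio

print(f"mitotic packing ratio: {mitotic_ratio:,.0f}:1")
print(f"compaction beyond the 30-nm fiber: {extra_beyond_fiber:.0f}-fold")
print(f"estimated mitotic length of the largest chromosome: {largest_mitotic_um:.1f} um")
```

On these figures, looping and mitotic condensation contribute a further 250-fold compaction beyond the 30-nm fiber, and 10 cm of DNA would be reduced to a chromosome about 10 μm long.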

[Labels in Figures 12.15 and 12.16: DNA double helix (2 nm in diameter); nucleosome filament (10 nm in diameter); H1 histone; 30-nm fiber; looped domains; protein scaffold; cohesin ring]

Figure 12.15 Chromatin loops: a higher level of chromatin structure. (a) Electron micrograph of a mitotic chromosome that had been treated to remove histones. The histone-depleted chromosome displays loops of DNA that are attached at their bases to a residual protein scaffold. (b) A simplified model by which rings of cohesin could play a role in maintaining loops of interphase DNA. (A: FROM JAMES R. PAULSON AND U. K. LAEMMLI, CELL 12:823, 1977, WITH PERMISSION OF ELSEVIER.)


Figure 12.16 Levels of organization of chromatin. Naked DNA molecules are wrapped around histones to form nucleosomes, which represent the lowest level of chromatin organization. Nucleosomes are organized into 30-nm fibers, which in turn are organized into looped domains. When cells prepare for mitosis, the loops become further compacted into mitotic chromosomes (see Figure 14.13).


Heterochromatin and Euchromatin After mitosis has been completed, most of the chromatin in highly compacted mitotic chromosomes returns to its diffuse interphase condition. Approximately 10 percent of the chromatin, however, generally remains in a condensed, compacted form throughout interphase. This compacted, densely stained chromatin is typically concentrated near the periphery of the nucleus, often in proximity with the nuclear lamina, as indicated in Figure 12.5a. Chromatin that remains compacted during interphase is called heterochromatin to distinguish it from euchromatin, which returns to a dispersed, active state. When a radioactively labeled RNA precursor such as [3H]uridine is given to cells that are subsequently fixed, sectioned, and autoradiographed, the clumps of heterochromatin remain largely unlabeled, indicating that they have relatively little transcriptional activity. The peripheral regions of the nucleus are thought to contain factors that promote transcriptional repression, making it a favorable environment for the localization of heterochromatin. As discussed later, the state of a particular region of the genome, whether it is euchromatic or heterochromatic, is stably inherited from one cell generation to the next. Heterochromatin is divided into two classes. Constitutive heterochromatin remains in the compacted state in all cells at all times and, thus, represents DNA that is permanently silenced. In mammalian cells, the bulk of the constitutive heterochromatin is found in regions that flank the telomeres and centromere of each chromosome and in a few other sites, such as the distal arm of the Y chromosome in male mammals. The DNA of constitutive heterochromatin consists primarily of repeated sequences (page 401) and con-

(a)

(b)

Figure 12.17 Facultative heterochromatin: the inactive X chromosome. (a) The inactivated X chromosome appears as a darkly staining heterochromatic structure, called a Barr body (arrows). (b) Random inactivation of either X chromosome in different cells during early embryonic development creates a mosaic of tissue patches and is responsible for the color patterns in calico cats. Each patch comprises the descendants of one cell that was present in the embryo at the time of inactivation. These patches are visually evident in calico cats, which are heterozygotes with an allele for black coat color residing on one X chromosome and an allele for orange coat color on the other X. This

tains relatively few genes. In fact, when genes that are normally active move into a position adjacent to these regions as a result of transposition or translocation, they tend to become transcriptionally silenced, a phenomenon known as the position effect. It is thought that heterochromatin contains components whose influence can spread outward a certain distance, affecting nearby genes. The spread of heterochromatin along the chromosome is apparently blocked by specialized barrier sequences (boundary elements) in the genome. Constitutive heterochromatin also serves to inhibit genetic recombination between homologous repetitive sequences. This type of recombination can lead to DNA duplications and deletions (see Figure 10.22). Facultative heterochromatin is chromatin that has been specifically inactivated during certain stages of an organism’s life or in certain types of differentiated cells (as in Figure 17.9b). An example of facultative heterochromatin can be seen by comparing the sex chromosomes in the cells of a female mammal to those of a male. The cells of males have a tiny Y chromosome and a much larger X chromosome. Because the X and Y chromosomes have only a few genes in common, males have a single copy of most genes that are carried on the sex chromosomes. Although cells of females contain two X chromosomes, only one of them is transcriptionally active. The other X chromosome remains condensed as a heterochromatic clump (Figure 12.17a) called a Barr body after the researcher who discovered it in 1949. Formation of a Barr body ensures that the cells of both males and females have the same number of active X chromosomes and thus synthesize equivalent amounts of the products encoded by X-linked genes.

explains why male calico cats are virtually nonexistent: because all cells in the male have either the black or orange coat color allele. (The white spots on this cat are due to a different, autosomal coat color gene.) (c) This kitten was cloned from the cat shown in b. The two animals are genetically identical but have different coat patterns, a reflection of the random nature of the X inactivation process (and likely other random developmental events). (A: COURTESY OF E.G. (MIKE) BERTRAM; B,C: COURTESY OF COLLEGE OF VETERINARY MEDICINE AND BIOMEDICAL SCIENCES, TEXAS A&M UNIVERSITY.)

X Chromosome Inactivation Based on her studies of the inheritance of coat color in mice, the British geneticist Mary Lyon proposed the following in 1961: 1. Heterochromatinization of the X chromosome in female

The Lyon hypothesis was soon confirmed.² Because maternally and paternally derived X chromosomes may contain different alleles for the same trait, adult females are in a sense genetic mosaics, where different alleles function in different cells. X-chromosome mosaicism is reflected in the patchwork coloration of the fur of some mammals, including calico cats (Figure 12.17b,c). Pigmentation genes in humans are not located on the X chromosome, hence the absence of “calico women.” Mosaicism due to X inactivation can be demonstrated in women, nonetheless. For example, if a narrow beam of red or green light is shone into the eyes of a woman who is a heterozygous carrier for red-green color blindness, patches of retinal cells with defective color vision can be found interspersed among patches with normal vision. The mechanism responsible for X inactivation has been a focus of attention since a 1992 report suggesting that inactivation is initiated by a noncoding RNA molecule, rather than a protein, that is transcribed from one of the genes (called XIST in humans) on the X chromosome that becomes inactivated (the Xi). The XIST RNA is a large transcript (over 17 kb long), which distinguishes it from many other noncoding

² The random inactivation of X chromosomes discussed here, which occurs after the embryo implants in the uterus, is actually the second wave of X chromosome inactivation to occur in the embryo. The first wave, which occurs very early in development, is not random but rather leads only to the inactivation of X chromosomes that had been donated by the father. This early inactivation of paternal X chromosomes is maintained in the cells that give rise to extraembryonic tissues (e.g., the placenta) and is not discussed in the text. Early paternal X inactivation is erased in cells that give rise to embryonic tissue and random X inactivation subsequently occurs.

The Histone Code and Formation of Heterochromatin Figure 12.13c shows a schematic model of the nucleosome core particle with its histone tails projecting outward. But this is only a general portrait that obscures important differences among nucleosomes. Cells contain a remarkable array of enzymes that are able to add chemical groups to or remove them from specific amino acid residues on the histone tails. Those residues that are subject to modification, most notably by methylation, acetylation, phosphorylation, or ubiquitination, are indicated by the colored bars in Figure 12.18. The past few years have seen the emergence of a hypothesis known as the histone code, which postulates that the state and activity of a particular region of chromatin depend on the specific modifications, or combinations of modifications, to the histone tails in that region. In other words, the pattern of modifications adorning the tails of the core histones contains encoded information governing the properties of those nucleosomes. Studies suggest that histone tail modifications act in two ways to influence chromatin structure and function.
1. The modified residues serve as docking sites to recruit a specific array of nonhistone proteins, which then determine the properties and activities of that segment of chromatin. A sampling of some of the specific proteins that bind selectively to modified histone residues is depicted in Figure 12.19. Each of the proteins bound to the histones in Figure 12.19 is capable of modulating some aspect of chromatin activity or structure.
2. The modified residues alter the manner in which the histone tails of neighboring nucleosomes interact with one another or with the DNA to which the nucleosomes are bound. Changes in these types of interactions can lead to changes in the higher order structure of chromatin. Acetylation of the lysine residue at position 16 on histone H4, for example, interferes with its interaction with a histone H2A molecule of an adjacent nucleosome, which in turn inhibits formation of the compact 30-nm chromatin fiber.

³ Approximately 15 percent of genes on the chromosome escape inactivation by an unknown mechanism. The “escapees” include genes that are also present on the Y chromosome, which ensures that they are expressed equally in both sexes.

12.2 Control of Gene Expression in Eukaryotes: Structure and Function of the Cell Nucleus

mammals occurs during early embryonic development and leads to the inactivation of the genes on that chromosome. 2. Heterochromatinization in the embryo is a random process in the sense that the paternally derived X chromosome and the maternally derived X chromosome stand an equal chance of becoming inactivated in any given cell. Consequently, at the time of inactivation, the paternal X can be inactivated in one cell of the embryo, and the maternal X can be inactivated in a neighboring cell. Once an X chromosome has been inactivated, its heterochromatic state is transmitted through many cell divisions, so that the same X chromosome is inactive in all the descendants of that particular cell. The process of X-chromosome inactivation is best studied in vitro during differentiation of mouse embryonic stem (ES) cells. ES cells are derived from very early embryos (page 21) and thus have two active X chromosomes. 3. Reactivation of the heterochromatinized X chromosome occurs in female germ cells prior to the onset of meiosis. Consequently, both X chromosomes are active during oogenesis, and all of the gametes receive a euchromatic X chromosome.
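The three points of the Lyon hypothesis can be illustrated with a toy simulation. The sketch below is only an illustration under simplifying assumptions: the function name `inactivate`, the number of founder cells, and the number of divisions are hypothetical choices, and the model ignores everything except the random choice and its clonal inheritance.

```python
import random

def inactivate(n_founders, divisions, seed=0):
    """Toy model: each founder cell silences one X at random,
    and every descendant inherits that founder's choice."""
    rng = random.Random(seed)
    clones = []
    for _ in range(n_founders):
        # Point 2: paternal and maternal X have an equal chance of inactivation.
        inactive_x = rng.choice(["maternal", "paternal"])
        # Clonal inheritance: all 2**divisions descendants keep the same inactive X.
        clones.append([inactive_x] * (2 ** divisions))
    return clones

clones = inactivate(n_founders=8, divisions=3)
for index, clone in enumerate(clones):
    print(f"patch {index}: inactive X = {clone[0]}, cells = {len(clone)}")
```

Each printed patch corresponds to one clone of cells sharing the same inactive X, which is the basis of the mosaic pattern described for calico cats.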

RNAs that tend to be quite small. Because of its size, XIST is described as a long noncoding RNA (or lncRNA) and represents only the first in a long and growing list of lncRNAs that have been discovered as regulatory factors in cellular activities. In fact, at least seven distinct lncRNAs are now thought to be involved in X-chromosome inactivation; we will restrict the present discussion to XIST. The XIST RNA selectively accumulates along the length of the X chromosome from which it is transcribed, where it helps recruit certain protein complexes that inactivate the genes at selected sites.³ The XIST gene is required to initiate inactivation, but not to maintain it from one cell generation to the next. This conclusion is based on the discovery of tumor cells in certain women that contain an inactivated X chromosome whose XIST gene is deleted. X inactivation is thought to be maintained by DNA methylation (page 531) and repressive histone modifications, as discussed in the next section.

[Figure 12.18 diagram: amino-terminal tail sequences of the core histones, with modified residues marked Ac (acetylation), Me (methylation of arginine, active lysine, or repressive lysine), P (phosphorylation), and Ub (ubiquitination). Tails shown: H2A (129 aa) SGRGKQGGKARAKAKSRSSRAGLQFPVGRVHRLLRKGNY; H2B (125 aa) PEPAKSAPAPKKGSKKAVTKAQKKDSKKRKRSRKESYSV; H3 (135 aa) ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPH; H4 (102 aa) SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLAR.]

Chapter 12 Control of Gene Expression

Figure 12.18 Histone modifications and the histone code. The amino terminal tails from histone proteins that extend out past the DNA in nucleosomes can be enzymatically modified by the covalent addition of methyl, acetyl, and phosphate groups. Methyl groups are added to either lysine (K) or arginine (R) residues, acetyl groups to lysine residues, and phosphate groups to serine (S) residues. The small protein ubiquitin can be added to one of the lysine residues in the core (rather than the tail) of H2A and H2B. Each lysine residue can have one, two, or three added methyl groups, and each arginine residue can have one or two added methyl groups. These modifications affect the affinity of the histone for interacting proteins that control the transcriptional activity of chromatin, which has led to the concept of a histone code. It is widely found that acetylation of lysines leads to transcriptional activation. The effects of methylation depend strongly on which of these residues is modified. For example, methylation of lysine 9 of histone H3 (i.e., H3K9) is typically present in heterochromatin and associated with transcriptional repression, as discussed in the text. Methylation of H3K27 and H4K20 is also strongly associated with transcriptional repression, whereas methylation of H3K4 and H3K36 is associated with transcriptional activation. Just as there are enzymes that catalyze each of these modifications, there are also enzymes (deacetylases, demethylases, phosphatases, and deubiquitinases) that specifically remove them. (FROM C. DAVID ALLIS ET AL., EPIGENETICS, FIGURE 3.6, PAGE 31, COPYRIGHT 2007. REPRINTED WITH PERMISSION FROM COLD SPRING HARBOR PRESS).
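The associations stated in the legend above can be collected into a small lookup table. This is a minimal sketch, not a predictive model: the names `MARK_EFFECT` and `expected_effect` are hypothetical, and the table holds only the associations the legend names explicitly.

```python
# Associations taken from the Figure 12.18 legend; anything else is "not stated".
MARK_EFFECT = {
    ("H3", "K9", "methylation"): "repression",
    ("H3", "K27", "methylation"): "repression",
    ("H4", "K20", "methylation"): "repression",
    ("H3", "K4", "methylation"): "activation",
    ("H3", "K36", "methylation"): "activation",
}

def expected_effect(histone, residue, modification):
    """Return the transcriptional association the legend gives for a mark."""
    # The legend treats lysine acetylation as generally activating.
    if modification == "acetylation" and residue.startswith("K"):
        return "activation"
    return MARK_EFFECT.get((histone, residue, modification), "not stated")

print(expected_effect("H3", "K9", "methylation"))
print(expected_effect("H4", "K16", "acetylation"))
print(expected_effect("H3", "S10", "phosphorylation"))
```

The lookup makes the central point of the legend explicit: the effect of methylation depends on which residue carries it, so the same chemical group maps to opposite outcomes at H3K9 and H3K4.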

For the moment, we will restrict the discussion to the formation of heterochromatin as it occurs, for example, during X chromosome inactivation. For the sake of simplicity, we will focus on modification of a single residue, lysine 9 of H3, which will illustrate the general principles by which cells utilize the histone code. The actions of several other histone modifications are indicated in Figure 12.18 and discussed in the accompanying legend, and another example is described on page 527. As we will see throughout this chapter, techniques have been developed in recent years to analyze changes

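[Figure 12.19 diagram: H3 tail residues K4 (Me), K9 (Me), S10 (P), K14 (Ac), K27 (Me), and K36 (Me), and H4 tail residues K20 (Me), K16 (Ac), K12 (Ac), and K8 (Ac), with the bound proteins labeled BPTF, CHD1, ING2, HP1, 14-3-3, Rsc4, PC, EAF3, CRB2, JMJD2, Brd2, Taf1, and Bdf1.]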

Figure 12.19 Examples of proteins that bind selectively to modified H3 or H4 residues. Each of the bound proteins possesses an activity that alters the structure and/or function of the chromatin. There is an added complexity that is not shown in this drawing in that modifications at one histone residue can influence events at other residues, a phenomenon known as cross-talk. For example, the binding of the heterochromatin protein HP1 to H3K9 is blocked by phosphorylation of the adjacent serine residue (H3S10), which typically occurs during mitosis. (FROM T. KOUZARIDES, CELL 128:696, 2007; CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Figure 12.20 Experimental demonstration of a correlation between transcriptional activity and histone acetylation. This metaphase chromosome spread has been labeled with fluorescent antibodies to acetylated histone H4, which fluoresce green. All of the chromosomes except the inactivated X stain brightly with the antibody against the acetylated histone. (FROM P. JEPPESEN AND B. M. TURNER, COVER OF CELL VOL. 74, NO. 2, 1993, WITH PERMISSION FROM ELSEVIER. COURTESY PETER JEPPESEN.)

maintenance of heterochromatin. Once bound to an H3 tail, HP1 is thought to interact with other proteins, including (1) SUV39H1, the enzyme responsible for methylating the H3K9 residue and (2) other HP1 molecules on nearby nucleosomes. These binding properties of the HP1 protein promote the formation of an interconnecting network of methylated nucleosomes, leading to a compact, higher-order chromatin domain (see Figure 12.21). Most importantly, this state is transmitted through cell divisions from one cell generation to the next (discussed on page 510). Histone modifications present in the parent cell must somehow instruct the formation of the same modifications on newly deposited nucleosomes in the daughter cells. The mechanism by which this occurs is poorly understood. Studies in a number of organisms indicate that small RNAs, similar in nature to those involved in RNA interference (page 457), play an important role in targeting a particular region of the genome to undergo H3K9 methylation and subsequent heterochromatinization. A model showing the types of events thought to take place during heterochromatin assembly is depicted in Figure 12.21. RNAs derived from highly repetitive elements like centromeres or from regions harboring transposable elements are processed by the RNA interference pathway and function to recruit histone modification enzymes, including histone methyltransferases that modify lysine 9 on histone H3. Modification in this manner recruits members of the HP1 chromodomain protein family, which leads to heterochromatin formation and gene silencing. This is particularly important for the silencing of mobile transposable elements and possibly viral DNA. If components of the RNAi machinery are deleted, methylation of H3K9 and heterochromatinization are impaired, allowing activation of mobile DNA elements.
In addition to its role in the formation of heterochromatin, the RNAi machinery may also function in the maintenance of the heterochromatic state from one cell generation to the next. These findings point to yet another role, on a rapidly growing list of roles, performed by noncoding RNAs. It would not be surprising, given the fact that most of the genome is now thought to be transcribed (page 461), to discover that RNAs play a major role in guiding many of the changes in chromatin structure that are known to occur during embryonic development or in response to physiological stimuli. The Structure of a Mitotic Chromosome The relatively dispersed state of the chromatin of an interphase cell favors interphase activities, such as replication and transcription. In contrast, the chromatin of a mitotic cell exists in its most highly condensed state, which facilitates the delivery of DNA to each daughter cell. When a chromosome undergoes compaction during mitotic prophase, it adopts a distinct and predictable shape determined primarily by the length of the DNA molecule in that chromosome and the position of the centromere (discussed later). The mitotic chromosomes of a dividing cell can be displayed by the technique depicted in Figure 12.22a. In this technique, a dividing cell is broken open, and the mitotic chromosomes from the cell’s nucleus settle and attach to the surface of the slide over a very small

that affect genome transcription, such as histone modifications, on a genome-wide level, rather than simply looking at these changes one gene at a time. This has given us a much broader view of the general importance of each of these phenomena than was possible only a few years ago. Comparison of the nucleosomes present within heterochromatic versus euchromatic chromatin domains revealed a striking difference. The lysine residue at the #9 position (Lys9 or K9) of histone H3 in heterochromatic domains is largely methylated, whereas this same residue in euchromatic domains is often acetylated. Removal of the acetyl groups from H3 and H4 histones is among the initial steps in conversion of euchromatin into heterochromatin. The correlation between transcriptional repression and histone deacetylation can be seen by comparing the inactive, heterochromatic X chromosome of female cells, which contains deacetylated histones, to the active, euchromatic X chromosome, whose histones exhibit a normal level of acetylation (Figure 12.20). Histone deacetylation is accompanied by methylation of H3K9, which is catalyzed by an enzyme (a histone methyltransferase) that appears to be dedicated to this, and only this, particular function. This enzyme, called SUV39H1 in humans, can be found localized within heterochromatin, where it may stabilize the heterochromatic nature of the region through its methylation activity. The formation of a methylated lysine at the #9 position endows the histone H3 tail with an important property: it becomes capable of binding with high affinity to proteins that contain a particular domain, called a chromodomain. The human genome contains at least 30 proteins with chromodomains, the best studied of which is heterochromatin protein 1 (or HP1). HP1 has been implicated in the formation and

Figure 12.21 A model in which small RNAs govern the formation of heterochromatin. Recent studies suggest that noncoding RNAs play a role in heterochromatin formation. In this model, RNAs are transcribed from both strands of repetitive DNA sequences (step 1). The RNAs form double-stranded molecules (step 2) that are processed by the endonuclease Dicer and other components of the RNAi machinery (page 457) to form a single-stranded siRNA guide and an associated protein complex (step 3). In step 4, the siRNA-protein complex has bound to a complementary segment of a nascent RNA and has recruited the histone methyltransferase SUV39H1. This leads to addition of methyl groups to the K9 residue of histone H3, replacing acetyl groups that were previously linked to the H3K9 residues. (The acetyl groups, which are characteristic of transcribed regions of euchromatin, are removed enzymatically from the lysine residues by a histone deacetylase, which is not shown.) In step 5, the acetyl groups have all been replaced by methyl groups, which serve as binding sites for the HP1 protein (step 6). The boundary element in the DNA prevents the spread of heterochromatinization into adjacent regions of chromatin. Once HP1 has bound to the histone tails, the chromatin can be packaged into higher-order, more compact structures by means of interaction between HP1 protein molecules (step 7). The enzyme SUV39H1 can also bind to methylated histone tails (not shown) so that additional nucleosomes can become methylated. In step 8, a region of highly compacted heterochromatin has been formed.
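The spreading behavior described in this legend, in which methylated nucleosomes promote methylation of their neighbors until a boundary element is reached, can be sketched as a toy one-dimensional model. Everything here is an illustrative assumption: the function name `spread_heterochromatin`, the one-nucleosome-per-round spreading rule, and the representation of the boundary as an index are hypothetical and are not taken from the text.

```python
def spread_heterochromatin(n_nucleosomes, nucleation_site, boundary, rounds):
    """Toy model of H3K9 methylation spreading along a nucleosome array.

    Nucleosomes start acetylated ("Ac"). The nucleation site is methylated
    ("Me"), and in each round every methylated nucleosome converts its
    immediate neighbors. Nucleosomes at index >= boundary are protected,
    standing in for the boundary element.
    """
    marks = ["Ac"] * n_nucleosomes
    marks[nucleation_site] = "Me"
    for _ in range(rounds):
        methylated = [i for i, mark in enumerate(marks) if mark == "Me"]
        for i in methylated:
            for j in (i - 1, i + 1):
                if 0 <= j < n_nucleosomes and j < boundary:
                    marks[j] = "Me"
    return marks

final = spread_heterochromatin(n_nucleosomes=10, nucleation_site=2, boundary=6, rounds=20)
print(" ".join(final))
```

However many rounds are run, the nucleosomes beyond the boundary index remain acetylated, which mirrors the role the legend assigns to the boundary element.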

[Figure 12.21 diagram, steps 1 to 8. Labels: repeated DNA, transcription, transcribed RNA, dsRNA, Dicer, siRNA, protein complex, SUV39H1, nascent RNA, RNA polymerase, acetyl group on K9 of H3, methyl group on K9 of H3, HP1, boundary element.]

area (as in Figure 12.22b). The chromosomes displayed in Figure 12.22b have been prepared using a staining methodology in which chromosome preparations are incubated with multicolored, fluorescent DNA probes that bind specifically to particular chromosomes. By using various combinations of DNA probes and computer-aided visualization techniques, each chromosome can be “painted” with a different “virtual color,” making it readily identifiable to the trained eye. In addition to providing a colorful image, this technique delivers superb resolution, which allows clinical geneticists to discover chromosomal aberrations that might otherwise be missed (see Figure 2 of the accompanying Human Perspective). If the individual chromosomes are cut out of a photograph such as that of Figure 12.22b, they can be matched up into homologous pairs (23 in humans) and ordered according to decreasing size, as shown in Figure 12.22c. A preparation of this type is called a karyotype. The chromosomes shown in the karyotype of Figure 12.22c have been prepared using a staining procedure that gives the chromosomes a crossbanded appearance. The pattern of these bands is highly characteristic for each chromosome of a species and provides a basis to identify chromosomes and compare them from one species to the next (see Figure 3 in the Human Perspective). Karyotypes are routinely prepared from cultures of blood cells and used to screen individuals for chromosomal abnormalities. As discussed in the accompanying Human Perspective, extra, missing, or grossly altered chromosomes can be detected in this manner.
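The ordering step in karyotype preparation, matching homologues and arranging the pairs by decreasing size, can be sketched as a grouping and sorting operation. This is a simplified illustration: the function name `build_karyotype` is hypothetical, the labels stand in for whatever identifies homologues in practice (paint color or banding pattern), and the lengths are made-up toy values in arbitrary units, not measurements.

```python
from collections import defaultdict

def build_karyotype(chromosomes):
    """Group (label, length) records into homologous sets by label,
    then order the sets by decreasing chromosome length."""
    groups = defaultdict(list)
    for label, length in chromosomes:
        groups[label].append(length)
    return sorted(groups.items(), key=lambda item: max(item[1]), reverse=True)

# Hypothetical toy spread: three chromosome types, two homologues each.
spread = [("B", 7.1), ("A", 9.8), ("C", 4.2), ("A", 9.9), ("C", 4.0), ("B", 7.0)]
karyotype = build_karyotype(spread)
for label, lengths in karyotype:
    print(label, lengths)
```

A group with one or three members instead of two would be the toy equivalent of a missing or extra chromosome, the kind of abnormality karyotypes are used to detect.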


[Figure 12.22 diagram. (a) Procedure: a drop of blood is collected with a capillary pipette with bulb and transferred to a vial for culturing. The culture medium includes a substance that stimulates mitosis in leukocytes. Culture approximately 72 hours, then add colchicine for 30 min to 3 hours. Collect cells by centrifugation. Wash with fresh medium. Add hypotonic solution to cells. Let sit 10 min. Remove supernatant and add cold fixative (3:1 methanol:acetic acid). Let sit 30 min in cold, then disperse cells. Place drops of fixative with cells (the cell suspension) on a wet slide. Evaporate fixative. Stain slide. Each site on the slide contains the chromosomes released from a single nucleus. (b) Spread of mitotic chromosomes. (c) Karyotype with chromosomes labeled 1 to 22, X, and Y.]

Figure 12.22 Human mitotic chromosomes and karyotypes. (a) Procedure used to obtain preparations of mitotic chromosomes for microscopic observation from leukocytes in the peripheral blood. (b) Photograph of a cluster of mitotic chromosomes from a single dividing human cell. The DNA of each chromosome has hybridized to an assortment of DNA probes that are linked covalently to two or more fluorescent dyes. Different chromosomes bind different combinations of these dyes and, consequently, emit light of different wavelengths. Emission spectra from different chromosomes are converted into distinct and recognizable display colors by computer processing. Pairs of homologous chromosomes can be identified by searching for chromosomes of the same color and size. (c) The stained chromosomes of a human male arranged in a karyotype. Karyotypes showing paired homologues, arranged according to chromosome size, are prepared from a photograph of chromosomes released from a single nucleus. (B: FROM E. SCHRÖCK ET AL., COURTESY OF EVELIN SCHRÖCK AND THOMAS RIED, SCIENCE 273:495, 1996; © 1996, REPRINTED WITH PERMISSION FROM AAAS; C: CNRI/SCIENCE PHOTO LIBRARY/ PHOTO RESEARCHERS.)

THE HUMAN PERSPECTIVE

Chromosomal Aberrations and Human Disorders In addition to mutations that alter the information content of a single gene, chromosomes may be subjected to more extensive alterations that occur most commonly during cell division. Pieces of a chromosome may be lost or segments may be exchanged between different chromosomes. Because these chromosomal aberrations follow chromosomal breakage, their incidence is increased by exposure to agents that damage DNA, such as viral infection, X-rays, or reactive chemicals. In addition, the chromosomes of some individuals contain “fragile” sites that are particularly susceptible to breakage. Persons with certain rare inherited conditions, such as Bloom syndrome, Fanconi anemia, and ataxia-telangiectasia, have unstable chromosomes that exhibit a greatly increased tendency toward breakage. The consequences of a chromosomal aberration depend on the genes that are affected and the type of cell in which it occurs. If the aberration occurs in a somatic (nonreproductive) cell, the consequences are generally minimal because only a few cells of the body are usually affected. On rare occasions, however, a somatic cell carrying an aberration may be transformed into a malignant cell, which can grow into a cancerous tumor. Chromosomal aberrations that occur during meiosis—particularly as a result of abnormal crossing over—can be transmitted to the next generation. When an aberrant chromosome is inherited through a gamete, all cells of the embryo will have the aberration, which generally proves lethal during development. There are several types of chromosomal aberrations, including the following:

■ Inversions. Sometimes a chromosome breaks in two places, and the segment between the breaks becomes resealed into the chromosome in reverse orientation. This aberration is called an inversion. More than 1 percent of humans carry an inversion that can be detected during chromosome karyotyping (see Figure 10.29b). A chromosome bearing an inversion usually contains all the genes of a normal chromosome, and therefore, the individual is not adversely affected. However, if a cell with a chromosome inversion enters meiosis, the aberrant chromosome cannot pair properly with its normal homologue because of differences in the order of their genes. In such cases, chromosome pairing is usually accompanied by a loop (Figure 1). If crossing over occurs within the loop, as shown in the figure, the gametes generated by meiosis either possess an additional copy of certain genes (a duplication) or are missing those genes (a deletion). When a gamete containing an altered chromosome fuses with a normal gamete at fertilization, the resulting zygote has a chromosome imbalance and is often nonviable.

■ Translocations. When all or a piece of one chromosome becomes attached to another chromosome, the aberration is called a translocation (Figure 2). Like inversions, a translocation that occurs in a somatic cell generally has little effect on the functions of that cell or its progeny. However, certain translocations increase the likelihood that the cell will become malignant. The best studied example is the Philadelphia chromosome, which is found in the malignant cells (but not the normal cells) of individuals with certain forms of leukemia. The Philadelphia chromosome, which is named for the city in which it was discovered in 1960, is a shortened version of human chromosome number 22. For years, it was thought that the missing segment represented a simple deletion, but with improved techniques for observing chromosomes, the missing genetic piece was found translocated to another chromosome (number 9). Chromosome

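[Figure 1 diagram: a normal chromosome and a chromosome carrying an inversion, with segments labeled A to D and the centromere marked, paired by means of a loop; the chromosomes produced by the crossover are shown at first meiotic anaphase.]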

Figure 1 The effect of inversion. Crossing over between a normal chromosome (purple) and one containing an inversion (orange) is usually accompanied by formation of a loop. The chromosomes that result from the crossover contain duplications and deficiencies, which are shown in the chromosomes at first meiotic division in the lower part of the figure.

number 9 contains a gene (ABL) that encodes a protein kinase that plays a role in cell proliferation. As a result of translocation, one small end of this protein is replaced by about 600 extra amino acids encoded by a gene (BCR) carried on the translocated piece of chromosome number 22. This novel “fusion protein” retains the catalytic activity of the original ABL but is no longer subject to the cell’s normal regulatory mechanisms. As a result, the affected cell becomes malignant and causes chronic myelogenous leukemia (CML). It has generally been assumed that translocations occur following the random breakage of chromosomal DNA. Recent studies, however, suggest that such breaks in the DNA may occur at sites during the normal process of transcription, an activity that may make the DNA more susceptible to breakage. Prostate cancer, for example, is characterized by translocations affecting a number of genes that are transcribed in normal prostate cells in response to male hormones (androgens). Like inversions, translocations cause problems during meiosis. A chromosome altered by translocation has a different genetic content from its homologue. As a result, the gametes formed by meiosis will either contain extra copies of genes or be missing genes. Translocations have been shown to play an important role in evolution, generating large-scale changes that may be

orangutans reveals a striking similarity. Close examination of the two ape chromosomes that have no counterpart in humans reveals that together they are equivalent, band for band, to human chromosome number 2 (Figure 3). At some point during the evolution of humans, an entire chromosome was apparently translocated to another, creating a single fused chromosome and reducing the haploid number from 24 to 23.

Figure 2 A translocation. Micrograph shows a set of human chromosomes in which chromosome 12 (bright blue) has exchanged pieces with chromosome 7 (red). The affected chromosomes have been made fluorescent by in situ hybridization with a large number of DNA fragments that are specific for each of the two chromosomes involved. The use of these “stains” makes it very evident when one chromosome has exchanged pieces with another chromosome. (COURTESY OF LAWRENCE LIVERMORE LABORATORY, FROM A TECHNIQUE DEVELOPED BY JOE GRAY AND DAN PINKEL.)

[Figure 3 diagram: banding pattern of human chromosome #2 shown alongside the corresponding chromosomes of the chimpanzee, gorilla, and orangutan.]

Telomeres In contrast to most prokaryotic chromosomes that consist of a circular molecule of DNA, each eukaryotic chromosome contains a single, continuous DNA molecule. The tips of each DNA molecule are composed of an unusual stretch of repeated sequences that, together with a group of specialized proteins, form a cap at each end of the chromosome called a telomere. Human telomeres contain the sequence TTAGGG (paired with the complementary AATCCC on the opposite strand) repeated from about 500 to 5000 times

Figure 3 Translocation and evolution. If the only two ape chromosomes that have no counterpart in humans are hypothetically fused, they match human chromosome number 2, band for band.

(Figure 12.23a). Unlike most repeated sequences that vary considerably from species to species, the same telomere sequence is found throughout the vertebrates, and similar sequences are found in most other organisms. This similarity in sequence suggests that telomeres have a conserved function in diverse organisms. As discussed in Chapter 13, the DNA polymerases that replicate DNA cannot initiate synthesis of a strand of DNA but can only add nucleotides to the 3′ end of an existing strand.
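The figures quoted for human telomeres can be checked with a little arithmetic. The sketch below, in which the names `complement`, `REPEAT`, `partner`, and `lengths_bp` are illustrative choices, derives the partner sequence of the TTAGGG repeat by base pairing and converts the stated range of 500 to 5000 repeats into base pairs.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Base-by-base complement, written in the same left-to-right order as seq."""
    return "".join(COMPLEMENT[base] for base in seq)

REPEAT = "TTAGGG"
partner = complement(REPEAT)

# Telomere length implied by the repeat counts given in the text.
lengths_bp = [len(REPEAT) * copies for copies in (500, 5000)]

print(partner)
print(lengths_bp)
```

Under these numbers a human telomere spans roughly 3,000 to 30,000 base pairs of repeated sequence.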

pivotal in the branching of separate evolutionary lines from a common ancestor. Such a genetic incident probably happened during our own recent evolutionary history. A comparison of the 23 pairs of chromosomes in human cells with the 24 pairs of chromosomes in the cells of chimpanzees, gorillas, and

■ Deletions. A deletion occurs when a portion of a chromosome is missing. As noted above, zygotes containing a chromosomal deletion are produced when one of the gametes is the product of an abnormal meiosis. Forfeiting a portion of a chromosome often results in a loss of critical genes, producing severe consequences, even if the individual’s homologous chromosome is normal. Most human embryos that carry a significant deletion do not develop to term, and those that do exhibit a variety of malformations. The first correlation between a human disorder and a chromosomal deletion was made in 1963 by Jerome Lejeune, a French geneticist who had earlier discovered the chromosomal basis of Down syndrome. Lejeune discovered that a baby born with a variety of facial malformations was missing a portion of chromosome 5. A defect in the larynx (voice box) caused the infant’s cry to resemble the sound of a suffering cat. Consequently, the scientists named the disorder cri-du-chat syndrome, meaning cry-of-the-cat syndrome.

■ Duplications. A duplication occurs when a portion of a chromosome is repeated. The role of duplications in the formation of multigene families was discussed on page 407. More substantial chromosome duplications create a condition in which a number of genes are present in three copies rather than the two copies normally present (a condition called partial trisomy). Cellular activities are very sensitive to the number of copies of genes, and thus extra copies of genes can have serious deleterious effects.


Chapter 12 Control of Gene Expression


Figure 12.23 Telomeres. (a) In situ hybridization of a DNA probe containing the sequence TTAGGG, which localizes to the telomeres of human chromosomes. (b) Demonstration that certain proteins bind specifically to telomeric DNA. These chromosomes were prepared from a meiotic nucleus of a yeast cell and incubated with the protein RAP1, which was subsequently localized at the telomeres by a fluorescent anti-RAP1 antibody. Blue areas indicate DNA staining, yellow areas represent anti-RAP1 antibody labeling, and red shows RNA

stained with propidium iodide. Humans have a homologous telomere protein (hRAP1 that is part of the shelterin complex). (A: FROM J. MEYNE, IN R. P. WAGNER, CHROMOSOMES: A SYNTHESIS, WILEY, 1993. © 1993. THIS MATERIAL IS USED BY PERMISSION OF JOHN WILEY & SONS, INC.; B: FROM FRANZ KLEIN ET AL., J. CELL BIOL. 117:940, 1992, FIG. 4C. COURTESY OF SUSAN M. GASSER; REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

Replication is initiated at the 5' end of each newly synthesized strand by synthesis of a short RNA primer (step 1, Figure 12.24a) that is subsequently removed (step 2). Because of this mechanism, and the additional processing steps shown in step 3, the 5' end of each strand is missing a short segment of DNA. As a result, the strand with the 3' end overhangs the strand with the 5' end. Rather than existing as an unprotected single-stranded terminus, the overhanging strand is “tucked back” into the double-stranded portion of the telomere to form a loop as illustrated in Figure 12.24b. This conformation is thought to protect the telomeric end of the DNA. A number of DNA-binding proteins have been identified that bind specifically to telomeres and are essential for telomere function. The protein that is bound to the chromosomes in Figure 12.23b plays a role in regulating telomere length in yeast. The telomeric DNA of animal cells binds a 6-subunit protein complex called shelterin. Among its functions, shelterin prevents a cell’s DNA repair machinery from mistaking the ends of the telomeric DNA for damaged or broken DNA strands, which would have serious consequences for the integrity of the genome. If cells were not able to replicate the ends of their DNA, the chromosomes would become shorter and shorter with each round of cell division (Figure 12.24a). This predicament has been called “the end-replication problem.” The primary mechanism by which organisms solve the “end-replication

problem” came to light in 1984 when Elizabeth Blackburn and Carol Greider of the University of California, Berkeley, discovered a novel enzyme, called telomerase, that can add new repeat units to the 3' end of the overhanging strand (Figure 12.24c). Once the 3' end of the strand has been lengthened, a conventional DNA polymerase can use the newly synthesized 3' segment as a template to return the 5' end of the complementary strand to its previous length. Telomerase is a reverse transcriptase that synthesizes DNA using an RNA template. Unlike most reverse transcriptases, the enzyme itself contains the RNA that serves as its template (Figure 12.24c). Telomeres are very important parts of a chromosome: they are required for the complete replication of the chromosome; they form caps that protect the chromosomes from nucleases and other destabilizing influences; and they prevent the ends of chromosomes from fusing with one another. Figure 12.25 shows mitotic chromosomes from a mouse that has been genetically engineered to lack telomerase. Many of the chromosomes have undergone end-to-end fusion, which leads to catastrophic consequences as the chromosomes are torn apart in subsequent cell divisions. Recent experiments suggest additional roles for telomeres, which have made them a focus of current research. Suppose a researcher were to take a small biopsy of your skin, isolate a population of fibroblasts from the dermis, and allow these cells to grow in an enriched culture medium. The

[Figure 12.24 artwork labels. Panel (a): (1) replication; (2) removal of 5' RNA primer; (3) processing of the 5' termini. Panel (b): overhanging strand. Panel (c): telomerase and its RNA; (2) elongation; (3) translocation; (4) elongation.]

Figure 12.24 The end-replication problem and the role of telomerase. (a) When the DNA of a chromosome is replicated (step 1), the 5' ends of the newly synthesized strands (red) contain a short segment of RNA (green), which had functioned as a primer for synthesis of the adjoining DNA. Once this RNA is removed (step 2), the 5' end of the DNA becomes shorter relative to that of the previous generation. The 5' ends at each end of the chromosome are further processed by nuclease activities (step 3), which increases the lengths of the single-stranded overhangs. (b) The single-stranded overhang does not remain as a free extension, but invades the duplex as shown here, displacing one of the strands, which forms a loop. The loop is a binding site for a complex of specific telomere-capping proteins that protect the ends of the chromosomes and regulate telomere length. (c) The mechanism of action of telomerase. The enzyme contains an RNA molecule that is complementary to the end of the G-rich strand, which extends past the C-rich strand. The telomerase RNA binds to the protruding end of the G-rich strand (step 1) and then serves as a template for the addition of nucleotides onto the 3' terminus of the strand (step 2). After a segment of DNA is synthesized, the telomerase RNA slides to the new end of the strand being elongated (step 3) and serves as the template for the incorporation of additional nucleotides (step 4). The gap in the complementary strand is filled by the replication enzymes polymerase α-primase (see Figure 13.21). (The TTGGGG sequence depicted in this drawing is that of the ciliated protist Tetrahymena, in which telomerase was discovered.) (C: C. W. GREIDER AND E. H. BLACKBURN, REPRINTED WITH PERMISSION FROM NATURE 337:336, 1989, COPYRIGHT 1989. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)
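The elongation and translocation cycle in part (c) can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not a model of the enzyme: the function names are hypothetical, the RNA segment AACCCCAAC is taken from the drawing and assumed to be written in the orientation that pairs with the DNA, and telomerase is reduced to a rule that continues the Tetrahymena TTGGGG repeat in phase at the 3' end of the overhang.

```python
# Illustrative sketch only: telomerase reduced to "continue the G-rich repeat
# at the 3' end". All function names are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}  # RNA base -> paired DNA base


def dna_from_template(rna_as_drawn: str) -> str:
    """Return the DNA that pairs with an RNA template, base by base.

    Assumes the RNA is written in the orientation shown in the figure,
    i.e. aligned antiparallel to the DNA strand it templates.
    """
    return "".join(COMPLEMENT[base] for base in rna_as_drawn)


def extend_overhang(strand: str, n_nucleotides: int, repeat: str = "TTGGGG") -> str:
    """Append n_nucleotides to the 3' end of strand, continuing the repeat in phase."""
    # Work out which position of the repeat comes next by matching the last
    # len(repeat) nucleotides of the strand against each rotation of the repeat.
    for phase in range(len(repeat)):
        if strand.endswith(repeat[phase:] + repeat[:phase]):
            break
    else:
        raise ValueError("3' end does not match the telomeric repeat")
    added = "".join(repeat[(phase + i) % len(repeat)] for i in range(n_nucleotides))
    return strand + added


print(dna_from_template("AACCCCAAC"))       # TTGGGGTTG
print(extend_overhang("GGGGTTGGGGTTG", 6))  # GGGGTTGGGGTTGGGGTTG
```

Each call to `extend_overhang` stands in for one round of elongation; calling it again on the result corresponds to translocation followed by further elongation.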

Figure 12.25 Telomerase and chromosome integrity. The chromosomes in this micrograph are from the cell of a mouse that lacks a functional gene for the enzyme telomerase. The telomeres appear as yellow spots following in situ hybridization with a fluorescently labeled telomere probe. Some chromosomes lack telomeres entirely and a number of chromosomes have fused to one another at their ends. Chromosome fusion produces chromosomes with more than one centromere, which in turn leads to chromosome breakage during cell division. The genetic instability resulting from loss of a telomere may be a major cause of cells becoming cancerous. (FROM MARIA A. BLASCO ET AL., COURTESY OF CAROL W. GREIDER, CELL, VOL. 91, COVER #1, 1997, WITH PERMISSION FROM ELSEVIER.)


fibroblasts would divide every day or so in culture and eventually cover the dish. If a fraction of these cells were removed from the first dish and replated on a second dish, they would once again proliferate and cover the second dish. You might think you could subculture these cells indefinitely (as was thought in the first half of the last century), but you would be wrong. As discovered in 1961 by Leonard Hayflick and Paul Moorhead of the Wistar Institute in Philadelphia, the cells stop dividing entirely after about 50 to 80 population doublings and enter a stage referred to as replicative senescence. If the average length of the telomeres in the fibroblasts at the beginning and end of the experiment is compared, one finds a dramatic decrease in telomere length over time in culture. Telomeres shrink because most cells lack telomerase and are unable to prevent the loss of their chromosome ends. With each cell division, the telomeres of the chromosomes become shorter and shorter. Telomere shortening continues to a critical point at which a physiological response is triggered within each cell that causes that cell to cease growth and division (Figure 12.26). As discussed in the figure legend, only cells that are able to resume expression of telomerase continue to proliferate indefinitely. Not only do the telomerase-expressing cells continue to divide, they do so without showing signs of the normal aging processes seen in control cultures.
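The arithmetic behind this limit can be made concrete with a toy Python calculation. This is a minimal sketch under loud assumptions: the function name is hypothetical, and the starting length, loss per division, and critical length are placeholder values chosen only so that the result falls within the 50 to 80 doublings mentioned above. They are not measurements from the text.

```python
from typing import Optional


def doublings_until_senescence(
    initial_bp: int,
    loss_per_division_bp: int,
    critical_bp: int,
    telomerase_gain_bp: int = 0,
) -> Optional[int]:
    """Count population doublings before telomeres reach the critical length.

    Returns None when telomerase restores at least as much DNA as is lost,
    i.e. the telomeres never shorten and the cells are not limited.
    All parameter values passed in are assumed, illustrative numbers.
    """
    net_loss = loss_per_division_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None  # telomere length is maintained indefinitely
    length = initial_bp
    doublings = 0
    while length > critical_bp:  # the cell can still divide
        length -= net_loss
        doublings += 1
    return doublings


# Placeholder values: 10,000 bp telomeres, 100 bp lost per division,
# arrest at 5,000 bp.
print(doublings_until_senescence(10_000, 100, 5_000))                          # 50
print(doublings_until_senescence(10_000, 100, 5_000, telomerase_gain_bp=100))  # None
```

The second call mirrors the telomerase-expressing cells described above: when the enzyme offsets the loss at each division, no limit on the number of doublings is reached.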

[Figure 12.26 plot: telomere length (longest to shortest) versus population doublings, with curves for germ cells, somatic cells, and immortalized cells (expressing telomerase); replicative senescence (Hayflick limit) and crisis are marked.]

Figure 12.26 Telomere dynamics during normal and abnormal growth. Germ cells express telomerase and maintain long telomeres. In contrast, most other cells express little to no telomerase; as their telomeres shorten, cell proliferation ceases, and they enter a state of replicative senescence, which is thought to contribute to normal aging. Cells in this state can remain alive for long periods in culture and, if stimulated in certain ways (by treatment with certain viral proteins or by eliminating the growth inhibitor p21), can regain their ability to grow and divide for an extended period. Eventually, however, the telomeres in these cells dramatically shorten, which causes genome instability and cell death. This second period of growth arrest is known as crisis. Cells that are able to reactivate telomerase can exit crisis and become immortal (and typically capable of causing cancer, page 751). (ADAPTED FROM OSTERHAGE AND FRIEDMAN, JBC 284: 16061, COPYRIGHT 2009. JOURNAL OF BIOLOGICAL CHEMISTRY BY AMERICAN SOCIETY FOR BIOCHEMISTRY & MOLECULAR BIOL. REPRODUCED WITH PERMISSION OF AMERICAN SOCIETY FOR BIOCHEMISTRY AND MOLECULAR BIOLOGY IN THE FORMAT REPRINT IN A BOOK VIA COPYRIGHT CLEARANCE CENTER.)

Interestingly, telomerase is absent from most of the body’s cells, but there are notable exceptions. The germ cells of the gonads retain telomerase activity, and the telomeres of their chromosomes do not shrink as the result of cell division (Figure 12.26). Consequently, each offspring begins life as a zygote that contains telomeres of maximal length. Similarly, the stem cells located in the lining of the skin and intestine and the hematopoietic stem cells of the blood-forming tissues also express this enzyme, which allows these cells to continue to proliferate and generate the huge numbers of differentiated cells required in these organs. The importance of telomerase is revealed by a rare condition in which individuals have greatly reduced telomerase levels, because they are heterozygous for the gene that encodes either the telomerase RNA or one of the protein subunits. These individuals suffer from bone marrow failure due to the inability of this blood-forming tissue to produce a sufficient number of blood cells over a normal human lifetime. If telomeres are such an important factor in limiting the number of times that a cell can divide, one might expect telomere shortening to be a major factor in normal human aging. A large body of research supports this proposal. In fact, some biologists describe telomere length as a type of “longevity clock,” providing a measure of how rapidly a particular individual is aging biologically, rather than just chronologically. At the time of this writing, at least two companies (both founded by prominent telomere researchers) are offering to measure the lengths of telomeres in a sample of a person’s blood cells. Other researchers argue that such tests will fail to provide a valid measure of an individual’s risk for age-related diseases. Even if a person with exceptionally long telomeres were likely to age more slowly than the average person, there is another aspect of the issue to consider. 
A large body of evidence indicates that telomere shortening plays a key role in protecting humans from cancer by limiting the number of divisions of a potential tumor cell. Whether or not a person with longer telomeres is more susceptible to the development of a serious malignancy remains to be determined. Regardless, the relationship between cancer growth and telomeres is a major focus of study. Malignant cells, by definition, are cells that have escaped the body’s normal growth controls and continue to divide indefinitely. How is it that malignant tumor cells can divide repeatedly without running out of telomeres and bringing on their own death? Unlike normal cells that lack detectable telomerase activity, approximately 90 percent of human tumors consist of cells that contain an active telomerase enzyme.⁴ According to one scenario, the growth of tumors is accompanied by intense selection for cells in which the expression of telomerase has been reactivated. The vast majority of tumor cells fail to express telomerase and die out, whereas the rare cells that express the enzyme are “immortalized.” This does not mean that activation of telomerase, by itself, causes cells to become malignant. As discussed in Chapter 16, the development of cancer is a multistep process in which cells typically develop abnormal chromosomes, changes in cell adhesion, and the ability to invade normal tissues. Telomere maintenance and unlimited cell division is thus only one property of cancer cells. It is interesting to note that the initial discoveries of telomeric DNA sequences and telomerase were carried out on Tetrahymena, the same single-celled pond creature exploited in the discovery of ribozymes (page 477). This point serves as another reminder that one never knows which experimental pathways will lead to discoveries that have great medical significance.

⁴ The other 10 percent or so have an alternate mechanism based on genetic recombination that maintains telomere length in the absence of telomerase.


Figure 12.27 Each mitotic chromosome has a centromere whose site is marked by a distinct indentation. Scanning electron micrograph of a mitotic chromosome. The centromere (C) contains highly repetitive DNA sequences (satellite DNA) and a protein-containing structure called the kinetochore that serves as a site for the attachment of spindle microtubules during mitosis and meiosis (discussed in Chapter 14). (FROM JEROME B. RATTNER, BIOESS. 13:51, 1991.) © 1991. THIS MATERIAL IS USED BY PERMISSION OF JOHN WILEY & SONS.

Epigenetics: There’s More to Inheritance than DNA

As described in the previous paragraph, α-satellite DNA is not required for the development of a centromere. In fact, dozens of unrelated DNA sequences have been found at the centromeres of marker chromosomes. It is not the DNA that indelibly marks the site as a centromere but the CENP-A-containing chromatin that is localized there. These findings raise a larger issue. Not all inherited traits are dependent on DNA sequences. Inheritance that is not encoded in DNA is referred to as epigenetic as opposed to genetic. The inactivation of the X chromosome discussed on page 498 is another example of an epigenetic phenomenon: the two X chromosomes can have identical DNA sequences, but one is inactivated and the other is not. Furthermore, the state of inactivation is transmitted from each cell to its daughters throughout the life of the person. However, unlike genetic inheritance, an epigenetic state can usually be reversed; X chromosomes, for example, are reactivated prior to formation of gametes. Biologists have struggled to understand (1) the mechanisms by which epigenetic information is stored and (2) the mechanisms by which an epigenetic state can be transmitted from cell to cell and from parent to offspring. In this discussion, we will focus primarily on one type of epigenetic phenomenon: the state of a cell’s gene activity. Consider a stem cell residing at the base of the epidermis (as in Figure 7.1). Certain genes in these cells are transcriptionally active and others are repressed, and it is important that this characteristic pattern of gene activity is transmitted from one cell to its daughters. However, not all of the daughter cells continue life as stem cells; some of them take on a new commitment and begin the process of differentiation into mature epidermal cells. This step requires a change in the transcriptional state of that cell.
Recent attention has focused on the histone code (page 499) as a critical factor in both the determination of the transcriptional state of a particular region of chromatin and its transmission to subsequent cellular generations. When the DNA of a cell is replicated, histones associated with the DNA as part of the nucleosomes are distributed randomly to the daughter cells along with the DNA molecules. As a result, each daughter DNA strand receives roughly half of the core histones that were associated with the parental strand (see Figure 13.23). The other half of the core histones that become associated with the daughter DNA strands are recruited from a pool of newly synthesized histone molecules. The modifications present on the histone tails in the parental


Centromeres

Each chromosome depicted in Figure 12.22 contains a site where the outer surfaces are markedly indented. The site of the constriction marks the centromere of the chromosome (Figure 12.27). In humans the centromere contains a tandemly repeated, 171-base-pair DNA sequence (called α-satellite DNA) that extends for at least 500 kilobases. This stretch of DNA associates with specific proteins that distinguish it from other parts of the chromosome. For instance, centromeric chromatin contains a unique H3 histone variant, called CENP-A in mammals, which replaces conventional H3 in a certain fraction of the centromeric nucleosomes and gives these nucleosomes unique properties. During the formation of mitotic chromosomes, the CENP-A-containing nucleosomes become situated on the outer surface of the centromere where they serve as the platform for the assembly of the kinetochore. The kinetochore, in turn, serves as the attachment site for the microtubules that separate the chromosomes during cell division (see Figure 14.16). Chromosomes lacking CENP-A fail to assemble a kinetochore and are lost during cell division. It has been suggested in previous chapters that DNA sequences responsible for essential cellular functions tend to be conserved. It came as a surprise, therefore, to discover that centromeric DNA exhibited marked differences in nucleotide sequence, even among closely related species. This finding suggests that the DNA sequence itself is not an important determinant of centromere structure and function, a conclusion that is strongly supported by the following studies on humans. Approximately 1 in every 2000 humans is born with cells that have an excess piece of chromosomal DNA that forms an additional diminutive chromosome, called a marker chromosome. In some cases, marker chromosomes are devoid of α-satellite

DNA, yet they still contain a primary constriction and a fully functional centromere that allows the duplicated marker chromosomes to be separated normally into daughter cells at each division. Clearly, some other DNA sequence in these marker chromosomes becomes “selected” as the site to contain CENP-A and other centromeric proteins. The centromere appears at the same site on a marker chromosome in all of the person’s cells, indicating that the property is transmitted to the daughter chromosomes during cell division. In one study, marker chromosomes were found to be transmitted stably through three generations of family members.


chromatin are thought to determine the modifications that will be introduced in the newly synthesized histones in the daughter chromatin. We saw on page 501, for example, that heterochromatic regions of chromatin have methylated lysine residues at position 9 of histone H3. The enzyme responsible for this methylation reaction is present as one of the components of heterochromatin. As heterochromatin is replicated, this histone methyltransferase is thought to methylate the newly synthesized H3 molecules that become incorporated into the daughter nucleosomes. In this way, the H3 methylation pattern of the chromatin, and thus its condensed heterochromatic state, is transmitted from the parent cell to its offspring. In contrast, euchromatic regions tend to contain acetylated H3 tails, and this modification is also transmitted from parental chromatin to progeny chromatin, which may serve as the epigenetic mechanism by which active euchromatic regions are perpetuated in daughter cells. Histone modifications represent one carrier of epigenetic information; covalent modifications to the DNA are another. This latter subject is taken up on page 531. Inappropriate changes in epigenetic state are associated with numerous diseases. There is also evidence to suggest that differences in physical appearance, disease susceptibility, and longevity between genetically identical twins may be due, in part, to epigenetic differences that appear between the twins as they age. Some of these differences in the epigenomes of identical twins are presumed to be caused by differences in environmental conditions that the individuals have experienced, and others are simply a result of random differences that are introduced during the many cell divisions that occur during a human lifetime.


Figure 12.28 Chromosome territories (a) Three and (b) all 23 pairs of human chromosomes were detected using fluorescence in situ hybridization analysis similar to that described for Figure 12.22b that allows each human chromosome to be distinguished from others and represented by an identifiable color. Each chromosome is found to occupy a distinct territory within the nucleus. (FROM MICHAEL R. HUBNER AND DAVID L. SPECTOR, (A) COURTESY OF IRINA SOLOVEI (B) COURTESY OF IRINA SOLOVEI AND ANDREAS BOLZER, ANNUAL REVIEW OF BIOPHYSICS, VOLUME 39; 471, FIG. 1, 2010. REPRINTED WITH PERMISSION OF ANNUAL REVIEWS, INC.)

may be related to the levels of activity of these two chromosomes: chromosome number 18 is relatively devoid of genes, whereas chromosome number 19 is rich in protein-coding sequences, many of which are presumably transcribed in these cells. A similar disposition is observed in the nuclei of women, where the inactive X chromosome is located at one edge of the


The Nucleus as an Organized Organelle

Examination of the cytoplasm of a eukaryotic cell under the electron microscope reveals the presence of a diverse array of membranous organelles and cytoskeletal elements. Examination of the nucleus, on the other hand, typically reveals little more than scattered clumps of chromatin and one or more irregular nucleoli. As a result, researchers were left with the impression that the nucleus is largely a “sack” of randomly positioned components. With the development of new microscopic techniques, including fluorescence in situ hybridization (FISH, page 402) and imaging of live GFP-labeled cells (page 273), it became possible to localize specific gene loci within the interphase nucleus. It became evident from these studies that the nucleus maintains considerable order. For example, the chromatin fibers of a given interphase chromosome are not strewn through the nucleus like a bowl of spaghetti, but are concentrated into a distinct territory that does not overlap extensively with the territories of other chromosomes (Figure 12.28). The localization within the nucleus of a given chromosome territory may not be entirely random. In the micrograph shown in Figure 12.29, the chromatin that comprises human chromosome number 18 (shown in green) occupies a territory near the periphery of the nucleus, whereas the chromatin of chromosome number 19 (shown in red) is localized more centrally within the organelle. This difference in nuclear location

Figure 12.29 Localizing specific chromosomes within an interphase nucleus. Micrograph of the nucleus of a human lymphocyte (stained blue) that was subjected to dual-label fluorescence in situ hybridization to visualize chromosomes numbers 18 and 19, which appear as green and red, respectively. Chromosome 19, which contains a greater density of protein-coding genes than chromosome 18, tends to be more centrally located within the cell’s nucleus. (FROM JENNY A. CROFT ET AL., COURTESY OF WENDY A. BICKMORE, J. CELL BIOL. 145:1119, 1999, FIG. 1. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

[Figure 12.30 artwork labels. Panels (a) and (b): micrographs. Panel (c): nucleus; transcription factories; loci A, B, C, and D; chromosome 1 territory; chromosome 2 territory.]

Figure 12.30 Interactions can occur between distantly located genes in response to physiological stimuli. (a) Two cells derived from a breast cancer line that have not been treated with hormone. The chromosome 2 and 21 territories, which have been fluorescently labeled in each cell, are positioned independently of one another within the nucleus. (b) Two of the same types of breast cancer cells visualized 60 minutes after estrogen treatment. Interactions between the chromosome 2 and 21 territories are now evident. Close examination shows the colocalization of the TFF1 locus (in green) on chromosome 21 and the GREB1 locus on chromosome 2 (in red). In this study, approximately half of the hormone-treated cells exhibited biallelic interactions (i.e., both TFF1 alleles interacting with GREB1 alleles) as depicted here. (c) A drawing illustrating how genes on different regions of the same chromosome (A and B), or on different chromosomes (C and D), can come together within the nucleus. In some cases, DNA sequences from distant loci may influence one another’s transcription activity, and in other cases these distant genes may simply share a common pool of proteins involved in the transcription process. (A, B: FROM QIDONG HU ET AL., COURTESY OF MICHAEL G. ROSENFELD, PROC. NAT ’L ACAD. SCI. U.S.A. 105: 19200, 2008, FIG. 2B AND 2C. © 2008, NATIONAL ACADEMY OF SCIENCES, U.S.A.)

uniformly throughout the nucleus, the processing machinery is concentrated within 20 to 50 irregular domains, referred to as “speckles.” According to current opinion, these speckles function as dynamic storage depots that supply splicing fac-


nucleus and the active X chromosome is situated internally (Figure 12.17a). Even though genes are typically transcribed while they reside within their territories, individual chromatin fibers can extend away from these territories for considerable distances. Furthermore, DNA sequences that participate in a common biological response but reside on different chromosomes can apparently come together within the nucleus where they can influence gene transcription. Interactions between different loci have been revealed by the invention of a variety of techniques that involve “chromosome conformation capture (3C).” In this approach, cells are fixed by treatment with formaldehyde, which causes DNA sequences residing in close proximity within the nucleus to become covalently crosslinked to one another (they are said to be “captured”). After the fixation process, the DNA is isolated and subjected to digestion by restriction enzymes (Section 18.12), and the digestion products are analyzed to determine which DNA sequences in the genome interact with a given DNA sequence (the “bait” sequence) at the time of fixation. For example, a researcher might want to know which DNA sequences throughout the genome interact with the -globin locus during the differentiation of erythrocytes in the bone marrow. DNA sequences that are found to interact using this technique are typically sequences that are present on the same chromosome. This is what you would expect, for example, between an enhancer and a promoter for the same gene, which come into close proximity as depicted in Figure 12.48. But numerous examples have also been found where the interacting DNA sequences are present on different chromosomes. A good example of such interchromosomal interactions comes from a study in which human cultured breast cells (both normal and malignant versions) were treated with the hormone estrogen (Figure 12.30). 
Estrogen induces the transcription of a large number of target genes through its binding to an estrogen receptor (ER). Two of estrogen’s target genes in humans are GREB1, located on chromosome 2, and TFF1, located on chromosome 21. Prior to treating the cells with estrogen, the chromosome 2 and 21 territories are at distant locations from one another, and the GREB1 and TFF1 gene loci are tucked away within the interior of their respective chromosome territories (Figure 12.30a). However, within minutes after the cells are exposed to estrogen, these two chromosome territories are repositioned into close physical proximity to one another, and the two gene loci become colocalized on the periphery of their territories (Figure 12.30b). These studies support the idea that genes are physically moved to sites within the nucleus called transcription factories, where the transcription machinery is concentrated, and that genes involved in the same response tend to become colocalized in the same factory (Figure 12.30c). A number of studies suggest that the movement of chromosomal loci within the nucleus is driven by actin–myosin interactions. Another example of the interrelationship between nuclear organization and gene expression is illustrated in Figure 12.31a. This micrograph shows a cell that has been stained with a fluorescent antibody against one of the protein factors involved in pre-mRNA splicing. Rather than being spread


tors for use at nearby sites of transcription. The green dot in the nucleus in Figure 12.31a is a viral gene that is being transcribed near one of the speckles. The micrographs of Figure 12.31b show a trail of splicing factors extending from a speckle domain toward a nearby site where pre-mRNA synthesis has recently been activated. The various structures of the nucleus, including nucleoli and speckles, are dynamic, steady-state compartments with their component parts constantly moving in and out of the structures. If that activity is blocked, the compartment simply disappears as its materials are dispersed into the nucleoplasm.

In addition to nucleoli and speckles, several other types of nuclear bodies (e.g., Cajal bodies, GEMs, and PML bodies) are often seen under the microscope. Each of these nuclear bodies contains large numbers of proteins that move in and out of the structure in a dynamic manner. Because none of these nuclear bodies are enclosed by a membrane, no special transport mechanisms are required for these large-scale movements. Various functions have been attributed to these nuclear structures, but they remain poorly defined. Moreover, they appear not to be essential for cell viability and will not be discussed further.

REVIEW


1. Describe the components that make up the nuclear envelope. What is the relationship between the nuclear membranes and the nuclear pore complex? How does the nuclear pore complex regulate bidirectional movement of materials between the nucleus and cytoplasm?
2. What is the relationship between the histones and DNA of a nucleosome core particle? How was the existence of nucleosomes first revealed? How are nucleosomes organized into higher levels of chromatin?
3. What is the difference in structure and function between heterochromatin and euchromatin? Between constitutive and facultative heterochromatin? Between an active and inactivated X chromosome in a female mammalian cell? How is the histone code a determinant of the state of a chromatin region?
4. What is the difference in structure and function between the centromeres and telomeres of a chromosome?
5. Describe some of the observations that suggest that the nucleus is an ordered compartment.

Figure 12.31 Nuclear compartmentalization of the cell’s mRNA processing machinery. (a) The nucleus of a cell stained with fluorescent antibodies against one of the factors involved in processing pre-mRNAs. The mRNA processing machinery is localized to approximately 30 to 50 discrete sites, or “speckles.” The cell shown in this micrograph had been infected with cytomegalovirus, whose genes (shown as a green dot) are being transcribed near one of these domains. (b) Cultured cells were transfected with a virus, and transcription was activated by addition of cyclic AMP. Images are seen at various times after transcriptional activation. The site of transcription of the viral genome in this cell is indicated by the arrows. This site was revealed at the end of the experiment by hybridizing the viral RNA to a fluorescently labeled probe (indicated by the white arrow in the fourth frame). Pre-mRNA splicing factors (orange) form a trail from existing speckles in the direction of the genes being transcribed. (A: FROM TOM MISTELI AND DAVID L. SPECTOR, CURR. OPIN. CELL BIOL. 10:324, 1998, WITH PERMISSION FROM ELSEVIER; B: FROM TOM MISTELI, JAVIER F. CÁCERES, AND DAVID L. SPECTOR, NATURE 387:525, 1997, REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

12.3 | An Overview of Gene Regulation in Eukaryotes

In addition to possessing a genome containing tens of thousands of genes, complex plants and animals are composed of many different types of cells. Vertebrates, for example, are composed of hundreds of different cell types, each far more complex than a bacterial cell and each requiring a distinct battery of proteins and RNAs that enable it to carry out specialized activities. Researchers struggled for many years to demonstrate that every cell in a complex multicellular organism contained all of the genes required to become any other cell in that organism. Convincing evidence for this widely held hypothesis was finally obtained in the late 1950s by two investigators, one working on plants and the other on animals. One of these key experiments was carried out by Frederick Steward and his colleagues at Cornell University, who demonstrated that a root cell isolated from a mature plant could be induced to grow into a fully developed plant that contained all the various

[Figure 12.32 diagram labels: Sheep of strain A; Sheep of strain B; Egg; Remove egg chromosomes; Enucleated egg; Prepare cell culture from tissue of mammary gland; Cell culture (primarily epithelial cells); Reduce serum level in culture to arrest growth and division; Cells in arrested (G0) phase; Fuse egg with cultured cell, and activate; Sheep of strain B]

Figure 12.32 The cloning of animals demonstrates that nuclei retain a complete complement of genetic information. In this experiment, an enucleated egg from a sheep of one breed was fused with a mammary gland cell from a female of another breed. The activated egg developed into a healthy lamb. Because all of the genes in the newborn lamb had to have been derived from the mammary cell nucleus (which was demonstrated by use of genetic markers), this experiment confirms the widely held belief that differentiated cells retain all of the genetic information originally present in the zygote. [The primary difficulty that is generally encountered in nuclear transplantation experiments occurs when the nucleus from an active somatic (nongerm) cell is suddenly plunged into the cytoplasm of a relatively inactive egg. To avoid damaging the donor nucleus, the cultured cells were forced into a quiescent state (called G0) by drastically reducing the content of serum in the culture medium.]

cell types normally present. In the second experiment, John Gurdon of Oxford University demonstrated that a nucleus could be removed from a differentiated intestinal cell of a Xenopus tadpole and transplanted into an egg whose own nucleus had been destroyed, and that the recipient egg containing the donor nucleus could develop into a normal adult amphibian. All of the nuclei present in this cloned animal were shown to be derived from the nucleus taken from the somatic cell. In another, more recent landmark experiment, Ian Wilmut’s group in Scotland was able to clone a sheep (Dolly) by transplanting nuclei derived from cultured mammary gland cells into unfertilized eggs whose chromosomes had been removed (Figure 12.32). These experiments established that the entire set of genetic instructions is present in adult cells and can support the development of an entire organism under the right conditions. Since the birth of Dolly, researchers have successfully cloned over a dozen other mammals, including mice, cattle, goats, pigs, rabbits, and cats.

Thus, it is not the presence or absence of genes in a cell that determines the properties of that cell, but how those genes are utilized. Those cells that become liver cells, for example, do so because they express a specific set of “liver genes,” while at the same time repressing those genes whose products are not involved in liver function. The success of the cloning experiments described above led to the conclusion that the transcriptional state of a differentiated cell, which in turn is dependent on the epigenetic state of its chromatin, is not irreversible. During a cloning experiment, the nucleus of a differentiated cell is able to stop expressing the genes of the adult tissue from which it is taken and begin to selectively express the genes that are appropriate for the activated egg in which it suddenly finds itself. We can conclude from these cloning experiments that a nucleus from a differentiated cell can be reprogrammed by factors that reside in the cytoplasm of its new environment. The subject of selective gene expression, which resides at the heart of molecular biology, will occupy us for the remainder of this chapter.

The average bacterial cell contains enough DNA to encode approximately 3000 polypeptides, of which about one-third are typically expressed at any particular time. Compare this to a human cell that contains enough DNA (6 billion base pairs) to encode several million different polypeptides. Even though the vast majority of this DNA does not actually contain protein-coding information, it is estimated that a typical mammalian cell manufactures at least 5000 different polypeptides at any given time. Many of these polypeptides, such as the enzymes of glycolysis and the electron carriers of the respiratory chain, are synthesized by virtually all the cells of the body. At the same time, each cell type synthesizes proteins that are unique to that differentiated state. It is these proteins, more than any other components, that give each cell type its unique characteristics.

Consider the situation that faces a developing red blood cell inside the marrow of a human bone. Of all the hundreds of different types of cells in the human body, only those in the lineage leading to red blood cells produce the protein hemoglobin. Moreover, hemoglobin accounts for more than 95 percent of a red blood cell’s protein, yet the genes that encode the two hemoglobin polypeptides represent less than one-millionth of the developing cell’s total DNA. Not only does the cell have to find this genetic needle in the chromosomal haystack, it has to regulate its expression to such a high degree that production of these few polypeptides becomes the dominant synthetic activity of the cell.

Because the chain of events leading to the synthesis of a particular protein includes a number of discrete steps, there are several levels at which control might be exercised. We will discuss regulation of gene expression in eukaryotic cells at four distinct levels, as illustrated in the overview of Figure 12.33:

1. Transcriptional control mechanisms determine whether a particular gene can be transcribed and, if so, how often.
2. Processing control mechanisms determine the path by which the primary mRNA transcript (pre-mRNA) is processed into a messenger RNA that can be translated into a polypeptide.
3. Translational control mechanisms determine whether a particular mRNA is actually translated and, if so, how often and for how long a period.
4. Posttranslational control mechanisms regulate the activity and stability of proteins.

In the following sections of this chapter, we will consider each of these regulatory strategies in turn.
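The four levels of control can be pictured with a toy steady-state calculation. The sketch below is an illustrative assumption, not a model taken from the text: it treats synthesis and decay of mRNA and protein as simple first-order processes, and every parameter name and value is hypothetical. It shows only that the abundance of a protein can, in principle, be shifted by a change at any one of the four levels.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GeneParams:
    """Hypothetical rate constants for one gene (illustrative values only)."""
    transcription_rate: float     # level 1: primary transcripts made per hour
    processing_efficiency: float  # level 2: fraction of transcripts processed into this mRNA
    mrna_half_life_h: float       # mRNA half-life in hours (how long it remains available)
    translation_rate: float       # level 3: polypeptides made per mRNA per hour
    protein_half_life_h: float    # level 4: protein half-life in hours


def steady_state_protein(p: GeneParams) -> float:
    """Protein copies per cell at steady state, assuming first-order decay."""
    mrna_decay = math.log(2) / p.mrna_half_life_h
    protein_decay = math.log(2) / p.protein_half_life_h
    mrna = p.transcription_rate * p.processing_efficiency / mrna_decay
    return mrna * p.translation_rate / protein_decay


baseline = GeneParams(10.0, 0.5, 2.0, 20.0, 10.0)
variants = {
    "baseline": baseline,
    "transcription doubled": replace(baseline, transcription_rate=20.0),
    "processing halved": replace(baseline, processing_efficiency=0.25),
    "translation doubled": replace(baseline, translation_rate=40.0),
    "protein half-life halved": replace(baseline, protein_half_life_h=5.0),
}
for label, params in variants.items():
    print(f"{label:26s} {steady_state_protein(params):10.0f} proteins per cell")
```

In this simplified picture the four factors multiply, so a twofold change at any single level produces a twofold change in the final amount of protein. Real cells are not this tidy, but the sketch makes the point that control can be exercised at each step.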

12.4 | Transcriptional Control

As in prokaryotic cells, differential gene transcription is a key mechanism by which eukaryotic cells activate or repress gene expression. Different genes are expressed by cells at different stages of embryonic development, by cells in different tissues, and by cells that are exposed to different types of stimuli. An example of tissue-specific gene expression is shown in Figure 12.34. In this case, a gene encoding a muscle-specific protein is being transcribed in the cells of a mouse embryo that will give rise to muscle tissue.

Every so often a new technology is invented that fundamentally changes the way certain types of problems in biology are addressed. Two such technologies are DNA microarrays (or “DNA chips”) and massively parallel sequencing (RNA-Seq). These techniques allow simultaneous profiling of the activity of all the genes in an organism or cell type, as opposed to earlier techniques that were mostly restricted to analyzing just one or a few genes at a time. In both approaches, RNA is first isolated from cells, tissues, or whole organisms; the two techniques then diverge according to the means by which gene-expression patterns are analyzed. First, we will examine the use

[Figure 12.33 diagram labels: transcriptional-level control (Gene 1, Gene 2, Gene 3; nascent RNAs); processing-level control (primary transcript with exons 1 to 4 and introns, processed to exons 1-2-3 or 1-2-4); translational-level control (inhibitor; nascent polypeptides); posttranslational-level control (polypeptide, t½; proteasome; fragments)]

Figure 12.33 Eukaryotic gene regulation. Transcriptional-level controls operate by determining which genes are transcribed and how often; processing-level controls operate by determining which parts of the primary transcripts become part of the pool of cellular mRNAs; translational-level controls regulate whether or not a particular mRNA is translated and, if so, how often and for how long; and posttranslational-level controls determine the longevity of specific proteins.

Figure 12.34 Experimental demonstration of tissue-specific gene expression. Transcription of the myogenin gene is activated specifically in those parts of this 11.5-day-old mouse embryo (the myotome of the somites) that will give rise to muscle tissue. The photograph shows a transgenic mouse embryo that contains the regulatory region of the myogenin gene placed upstream from a bacterial β-galactosidase gene, which acts as a reporter. The β-galactosidase gene is commonly employed to monitor the tissue-specific expression of a gene because the presence of the enzyme β-galactosidase is easily revealed by the blue color produced in a simple histochemical test. The activation of transcription, which results from transcription factors binding to the regulatory region of the myogenin gene, has occurred in the blue-stained cells. (FROM T. C. CHENG ET AL., COURTESY OF ERIC N. OLSON, J. CELL BIOL. 119:1652, 1992, REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)


of a microarray to identify genes that are expressed differently between yeast strains grown either (1) in the presence of glucose or (2) as the glucose is depleted and the yeast begin to utilize ethanol as a carbon source (Figure 12.35). Part a of this figure provides an outline of the basic steps taken in this type of experiment; these steps can be described briefly as follows:

1. DNA fragments representing individual genes to be studied are generated using techniques discussed in Chapter 18. In the case depicted in step a at the bottom of Figure 12.35a, each well on the plate holds a small volume containing a specific, cloned yeast gene. Different wells contain different genes. The cloned DNAs are spotted, one at a time, in an ordered array on a glass slide by an automated instrument that delivers a few nanoliters of a concentrated DNA solution onto a specific spot on the slide (step b). A completed DNA microarray is shown in step c. Using this technique, DNA fragments from thousands of different genes or even entire genomes can be spotted at known locations on a glass surface.
2. Meanwhile, the mRNAs present in the cells being studied are purified (step 1, Figure 12.35) and converted to a population of fluorescently labeled, complementary DNAs (cDNAs) (step 2). (The method for preparation of cDNAs is described in Figure 18.45.) In the example illustrated in Figure 12.35, and discussed below, cDNAs are prepared from two different cell populations, one labeled with a fluorescent green dye and the other with a fluorescent red dye. The two preparations of labeled cDNAs are mixed (step 3) and then incubated with the slide containing the immobilized DNA (step 4).
3. Those DNAs in the microarray that have hybridized to a labeled cDNA are identified by examination of the slide under a fluorescence microscope (step 5). Any spot in the microarray that exhibits fluorescence represents a gene that has been transcribed in the cells being studied.

⁵ Differences in mRNA abundance likely reflect differences in mRNA stability (page 538) as well as differences in rates of transcription. Consequently, results obtained from DNA microarrays cannot be interpreted solely on the basis of transcriptional-level control.


Figure 12.35b shows the actual results of this experiment. Each spot on the microarray of Figure 12.35b contains an immobilized DNA fragment from a different gene in the yeast genome. Taken together, the spots contain DNA from all of the roughly 6200 protein-coding genes present in a yeast cell. As discussed above, this particular DNA microarray was hybridized with a mixture of two different cDNA populations. One population of cDNAs, which was labeled with a green fluorescent dye, was prepared from the mRNAs of yeast cells grown in the presence of a high concentration of glucose. Unlike most cells, baker’s yeast cells grown in the presence of glucose and oxygen obtain their energy by glycolysis, rapidly converting the glucose into ethanol (Section 3.3). The other population of cDNAs, which was labeled with a red fluorescent dye, was prepared from the mRNAs of yeast cells that were growing aerobically in a medium rich in ethanol but lacking glucose. Cells growing under these conditions obtain their energy by oxidative phosphorylation, which requires the enzymes of the TCA cycle (page 185). The two cDNA populations were mixed and hybridized to the DNA probes on the microarray, and the slide was examined with a fluorescence microscope. Genes that are active in one or the other growth medium appear as either green or red spots in the microarray (step 5, Figure 12.35a,b). Those spots that remain devoid of color in Figure 12.35 represent genes that are not transcribed in these cells in either growth medium, whereas those spots that have a yellow fluorescence represent genes that are transcribed in cells in both types of media. A close-up of a small portion of the microarray is shown in the inset.

The experimental results shown in Figure 12.35b identify those genes transcribed in yeast cells under two different growth conditions. But just as importantly, they also provide information about the abundance of individual mRNAs in the cells, which is proportional to the intensity of a spot’s fluorescence. A spot that displays a very strong green fluorescence, for example, represents a gene whose transcripts are abundant in yeast cells grown in glucose but that is repressed in yeast cells grown in ethanol. Concentrations of individual mRNAs can vary by more than 100-fold in yeast cells. The technique is so sensitive that an mRNA can be detected at a level of less than one copy per cell.⁵

Figure 12.35c shows the changes in the concentration of glucose and ethanol over the course of an experiment. The glucose is rapidly metabolized by the yeast cells, leading to the disappearance of the sugar within a few hours. The ethanol produced by glucose fermentation is then gradually metabolized by the yeast cells over the next five days until it eventually disappears from the medium. Figure 12.35d shows the changes in the level of expression of the genes encoding the enzymes of the TCA cycle during the course of this experiment. The levels of expression of each gene (labeled on the right) are shown in intervals of time (labeled at the top), using shades of red to represent increases in expression, and shades of green to represent decreases in expression.
It is evident that transcription of genes encoding TCA enzymes is stimulated when the cells adapt to growth on a carbon source (ethanol) that is metabolized by aerobic respiration, and then repressed when the ethanol is exhausted. DNA microarrays are currently being employed to study changes in gene expression that occur during a wide variety of biological events, such as cell division and the transformation of a normal cell into a malignant cell. It is even possible to study the diversity of RNAs being produced by a single tumor cell, once the cDNAs are amplified by PCR. Now that a number of eukaryotic genomes have been sequenced, researchers have an unlimited variety of genes whose expression can be monitored under different conditions. More recently, a new technology that depends on sequencing of small fragments of either genomic DNA or cDNAs derived from RNA has further revolutionized the ability to globally analyze gene-expression patterns. Instead of hybridizing cDNA fragments to DNA chips, as shown in the microarray experiment above, the DNA fragments are directly

[Figure 12.35a diagram labels. cDNA preparation: (1) extract and purify mRNAs from yeast cells growing in glucose-rich medium and from yeast cells growing in ethanol-rich medium lacking glucose; (2) synthesize cDNAs containing green fluorescent dye Cy3 or red fluorescent dye Cy5; (3) mix two cDNA populations; (4) incubate microarray with mixed cDNA population; (5) hybridized microarray. Microarray preparation: (a) DNA clones (each well contains a different yeast gene); (b) spot DNA from each well onto slide (application of DNA spot onto slide); (c) completed DNA microarray. Panels (b), (c), and (d) are described in the caption.]

Figure 12.35 DNA microarrays and analysis of gene expression. (a) Steps in the construction of a DNA microarray. Preparation of the cDNAs (i.e., DNAs that represent the mRNAs present in a cell) used in the experiment is shown in steps 1–3. Preparation of the DNA microarray is shown in steps a–c. The cDNA mixture is incubated with the microarray in step 4 and a hypothetical result is shown in step 5. The intensity of the color of the spot is proportional to the number of cDNAs bound there. (b) Results of an experiment using a mixture of cDNAs representing mRNAs transcribed from yeast cells (1) in the presence of glucose (green-labeled cDNAs) and (2) in the presence of ethanol after the glucose has been depleted (red-labeled cDNAs). Those spots displaying yellow fluorescence correspond to genes that are expressed under both growth conditions. The inset at the lower right shows a close-up of a small portion of the microarray. The details of this experiment are discussed in the text. (c) Plot showing the changes in glucose and ethanol concentrations in the media and in cell density during the experiment. Initially, the yeast cells consume glucose, then the ethanol they had produced by fermentation, and finally they cease growth after exhausting both sources of chemical energy. (d) Changes in expression of genes encoding TCA-cycle enzymes during the course of the experiment. Each horizontal row depicts the level of expression of a particular gene at different time periods. The names of the genes are provided on the right (see Figure 5.7 for the enzymes). Bright red squares indicate the highest level of gene expression; bright green squares indicate the lowest level of expression. Expression of genes encoding TCA-cycle enzymes is induced as the cells begin metabolizing ethanol and repressed when the ethanol has been exhausted. (C): COURTESY OF PATRICK O. BROWN. SEE JOSEPH DERISI ET AL., SCIENCE 278:680, 1997 AND TRACY L. FEREA AND PATRICK O. BROWN, CURR. OPIN. GEN. DEVELOP. 9:715, 1999. CURRENT OPINION IN GENETICS & DEVELOPMENT BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)
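The color logic of the two-dye experiment in Figure 12.35 can be sketched in a few lines of Python. This is a minimal illustration, assuming each spot has already been reduced to a pair of background-corrected intensities (green for the glucose-grown sample, red for the ethanol-grown sample). The gene names, intensity values, detection threshold, and twofold cutoff are hypothetical choices made for the example, not values from the experiment.

```python
import math


def classify_spot(green, red, detect=100.0, fold=2.0):
    """Classify one microarray spot from its two fluorescence intensities.

    green: intensity from the glucose-grown sample (green dye)
    red:   intensity from the ethanol-grown sample (red dye)
    detect and fold are assumed cutoffs, not values from the experiment.
    Returns (color_call, log2(red/green)); the ratio is None when no signal.
    """
    if green < detect and red < detect:
        return "none", None  # not detectably transcribed in either medium
    # Floor at 1.0 so a zero reading cannot cause division by zero.
    log_ratio = math.log2(max(red, 1.0) / max(green, 1.0))
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red", log_ratio      # more abundant in ethanol-grown cells
    if log_ratio <= -threshold:
        return "green", log_ratio    # more abundant in glucose-grown cells
    return "yellow", log_ratio       # comparable abundance in both samples


# Hypothetical spots: name -> (green intensity, red intensity)
spots = {
    "gene_1": (4000.0, 200.0),
    "gene_2": (200.0, 4000.0),
    "gene_3": (1500.0, 1200.0),
    "gene_4": (50.0, 40.0),
}
for name, (g, r) in spots.items():
    call, ratio = classify_spot(g, r)
    shown = "n/a" if ratio is None else f"{ratio:+.2f}"
    print(f"{name}: {call:6s} log2(red/green) = {shown}")
```

Working with the logarithm of the red-to-green ratio treats a twofold increase and a twofold decrease symmetrically, which is one common way to express the red and green shading used in panel d.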

genes that a person carries. It is hoped, one day, that this latter information may advise a person of the diseases to which they may be susceptible during their life, giving them an opportunity to take early preventive measures.

The Role of Transcription Factors in Regulating Gene Expression

Great progress has been made in elucidating how certain genes can be transcribed in a particular cell, while other genes remain inactive. Transcriptional control is orchestrated by a large number of proteins, called transcription factors. As discussed in Chapter 11, these proteins can be divided into two functional classes: general transcription factors that bind at core promoter sites in association with RNA polymerase


sequenced and computational approaches are then used to map the fragments back to the genome. This approach allows researchers to identify each fragment, reassemble the fragments into complete transcripts, and quantitate the exact levels of each of the RNA transcripts, providing a profile of the gene-expression patterns of the cells being studied. The latest sequencing technologies allow accurate analyses starting from just nanogram levels of RNA. Figure 12.36 shows an experiment that combined both DNA microarrays and so-called deep sequencing to identify differentially expressed genes between breast cancer cell lines that either express the estrogen receptor (ER) or do not. This is important, because treatment protocols differ depending on whether a particular breast cancer expresses the ER. The experimental results shown in Figure 12.36 examined 129 different breast cancer tumors for the expression of 149 genes that segregate between ER− and ER+ tumors. The differences are represented by changes in blue (low expression) or red (high expression) and demonstrate that such analyses can identify the genotype of a breast cancer biopsy, which provides physicians an opportunity to personalize therapy. These technological approaches have many potential uses in addition to providing a visual portrait of gene expression. For example, they can be used to determine the degree of genetic variation in human populations or to identify the alleles for particular

Figure 12.36 Transcription profiling to personalize breast cancer therapy. Microarrays and RNA-sequencing technologies were used to examine gene-expression patterns in 76 estrogen receptor positive and 53 estrogen receptor negative tumors. Each column of tiny squares within the microarray shows the results from a single cancer patient. The data from ER+ patients are shown on the left and from ER− patients on the right. A total of 179 genes (listed in the column to the right of the microarray) were identified that are differentially expressed between the two groups. The ability to define a molecular portrait of tumors that will respond to hormone therapy can personalize breast cancer treatment. (FROM ZHIFU SUN ET AL., COURTESY OF E. AUBREY THOMPSON, PLOS ONE 6: e17490, 2011.)
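The quantitation step of the sequencing approach, in which the fragments that map to each transcript are counted, can be sketched as follows. This is a simplified illustration that assumes the reads have already been mapped to genes. The gene names, lengths, and counts are made up, and transcripts per million (TPM) is one common normalization chosen here for illustration rather than a method named in the text.

```python
def tpm(read_counts, lengths_bp):
    """Convert mapped-read counts per gene into transcripts per million (TPM).

    read_counts: dict of gene name -> number of sequenced fragments mapped to it
    lengths_bp:  dict of gene name -> transcript length in base pairs
    """
    # Reads per kilobase: corrects for long transcripts yielding more fragments.
    rpk = {gene: read_counts[gene] / (lengths_bp[gene] / 1000.0)
           for gene in read_counts}
    total = sum(rpk.values())
    if total == 0:
        return {gene: 0.0 for gene in rpk}
    # Rescale so that the values within one sample sum to one million.
    return {gene: value / total * 1_000_000 for gene, value in rpk.items()}


# Hypothetical sample: gene_B has twice the reads of gene_A but is twice as long.
counts = {"gene_A": 500, "gene_B": 1000, "gene_C": 0}
lengths = {"gene_A": 1000, "gene_B": 2000, "gene_C": 1500}

for gene, value in tpm(counts, lengths).items():
    print(f"{gene}: {value:,.0f} TPM")
```

Dividing by transcript length before rescaling matters because a long mRNA yields more sequenced fragments than a short one present at the same number of copies; without that correction, gene_B above would appear twice as abundant as gene_A.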

[Figure 12.37 diagram labels: transcription factors Tcf3, Stat3, E2f1, Smad1, Nanog, Esrrb, Klf2, Sall4, Oct4, Klf5, Sox2, Klf4, Zfx, and n-Myc shown at the Oct4 gene]

Figure 12.37 Combinatorial control of transcription. Transcription of the Oct4 gene requires the action of multiple transcription factors that bind upstream of the start site of transcription. Oct4 is the single most important (i.e., least dispensable) gene in maintaining the pluripotent state in embryonic stem cells. Note that the Oct4 transcription factor has a role in regulating Oct4 gene transcription. (FROM NG, HUCK-HUI; SURANI, M. AZIM, NATURE CELL BIOLOGY 13:490, 2011. NATURE CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

(page 442), and sequence-specific transcription factors that bind to various regulatory sites of particular genes. This latter group of transcription factors can act either as transcriptional activators that stimulate transcription of the adjacent gene or as transcriptional repressors that inhibit transcription. We will focus in this section on transcriptional activators, whose job is to bind to specific regulatory sites within the DNA and initiate the recruitment of a large number of protein complexes that bring about the actual transcription of the gene itself.

We know the detailed structure of many transcription factors and how they interact with their target DNA sequences. Single genes are usually controlled by many different DNA regulatory sites that bind a combination of different transcription factors (Figure 12.37). Conversely, a single transcription factor may bind to numerous sites around the genome, thereby controlling the expression of a host of different genes. Each type of cell has a characteristic pattern of gene transcription, which is determined by the particular complement of transcription factors contained in that cell.

The control of gene transcription is complex, regulated by the presence of multiple binding sites for transcription factors and by the binding affinity of these transcription factors for each sequence. Transcription factors have preferred, high-affinity binding sites. A region upstream of a given gene might contain one or more preferred, high-affinity sites or might contain sites with a slightly different sequence that bind with lower affinity. The binding of multiple transcription factors is usually required to activate transcription. An example of this type of cooperative interaction between neighboring transcription factors is illustrated in Figure 12.38 and discussed further on page 522. In a sense, the regulatory region of a gene can be thought of as a type of integration center for that gene’s expression. Cells exposed to different stimuli respond by synthesizing different transcription factors, which bind to different sites in the DNA. The extent to which a given gene is transcribed depends on the particular combination of transcription factors bound to upstream regulatory elements. Given that roughly 5 to 10 percent of genes encode transcription factors, it is apparent that a virtually unlimited number of combinations of interactions among these proteins is possible. When this is coupled

Figure 12.38 Interactions between transcription factors bound to different sites in the regulatory region of a gene. This image depicts the tertiary structure of two separate transcription factors, NFAT-1 (green) and AP-1 (whose two subunits are shown in red and blue) bound to the DNA upstream from a cytokine gene involved in the immune response. Cooperative interaction between these two proteins alters the expression of the cytokine gene. The gray bar traces the path of the DNA helix, which is bent as the result of this protein–protein interaction. (FROM TOM K. KERPPOLA, STRUCTURE 6:550, 1998, WITH PERMISSION FROM ELSEVIER.)
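The dependence of transcription on both the combination of factors present in a cell and their binding affinities can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not a model from the text: it uses simple equilibrium binding, occupancy = [TF] / ([TF] + Kd), treats the sites as independent (so cooperative interactions such as the one in Figure 12.38 are deliberately ignored), and takes a gene to be active only when every required site is occupied. Factor names, concentrations, and Kd values are hypothetical.

```python
def occupancy(conc_nM, kd_nM):
    """Equilibrium fraction of time a site is occupied: [TF] / ([TF] + Kd)."""
    if kd_nM <= 0:
        raise ValueError("Kd must be positive")
    return conc_nM / (conc_nM + kd_nM)


def promoter_activity(tf_conc_nM, sites):
    """Relative activity = probability that every required site is occupied.

    tf_conc_nM: dict of factor name -> concentration in the cell (absent = 0)
    sites: list of (factor name, Kd in nM) for the gene's regulatory region
    Assumes independent binding, so cooperativity is not modeled.
    """
    activity = 1.0
    for factor, kd in sites:
        activity *= occupancy(tf_conc_nM.get(factor, 0.0), kd)
    return activity


# Hypothetical cell types that differ in their complement of factors.
cell_a = {"TF1": 50.0, "TF2": 50.0}
cell_b = {"TF1": 50.0}  # lacks TF2

# Two hypothetical genes: same factors, different affinity at the TF2 site.
gene_high_affinity = [("TF1", 5.0), ("TF2", 5.0)]
gene_low_affinity = [("TF1", 5.0), ("TF2", 200.0)]

for cell_name, cell in (("cell A", cell_a), ("cell B", cell_b)):
    for gene_name, sites in (("high-affinity gene", gene_high_affinity),
                             ("low-affinity gene", gene_low_affinity)):
        print(f"{cell_name}, {gene_name}: {promoter_activity(cell, sites):.2f}")
```

Even in this crude picture, the same pair of factors drives the two genes to different extents because of site affinity, and a cell lacking one factor does not express either gene.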

to the fact that the binding sites for these factors vary from gene to gene, the combination of the presence or absence of a given factor and variable binding affinity for each factor allows for precise variation in gene-expression patterns between cells of different types, different tissues, different stages of development, and different physiologic states.

The Role of Transcription Factors in Determining a Cell’s Phenotype

The phenotype of a differentiated cell, such as a fibroblast or an epithelial cell, is generally quite stable. However, such cells can be induced to adopt new phenotypes if they are forced to express certain genes that they would not normally express. It was shown in the late 1980s, for example, that it was possible to induce connective tissue fibroblasts to become muscle cells by forcing the cells to express a single tissue-specific transcription factor, MyoD (shown in Figure 12.42a). As a result of this and other experiments, MyoD became known as a “master regulatory factor” that plays a key role in directing the differentiation of muscle tissue in developing embryos. Another example of the “power” of transcription factors in the determination of a cell’s phenotype is shown in Figure 12.39, which shows the dramatic result of an experiment in which a fruit fly gene called eyeless has been expressed in cells that would not normally express this gene (in this case, cells of the developing leg). Expression of eyeless activates a developmental pathway that leads to the formation


of an eye situated in a very unusual location of the body. Like MyoD in vertebrate muscle development, the transcription factor Eyeless can be considered to act as a master regulatory factor in the development of eyes in these insects.

Figure 12.39 Phenotypic conversion induced by abnormal expression of a single transcription factor. The leg of this fruit fly bears a fully formed eye that has developed as the result of the forced expression of the eyeless gene within cells of the leg primordium during the development of this insect. (COURTESY WALTER GEHRING, UNIVERSITAT BASEL.)

The most impressive experiments to test the power of transcription factors in directing phenotypic transformations have focused on embryonic stem (ES) cells. As discussed in the Human Perspective in Chapter 1, ES cells appear very early in the development of a mammalian embryo and exhibit two key properties: they are (1) capable of indefinite self-renewal and (2) pluripotent; that is, capable of differentiating into all of the different types of cells in the body. It had been known for several years that these pluripotent stem cells had a particular set of transcription factors that played an important role in maintaining their self-renewing, pluripotent state. Just how important these transcription factors were in the biology of ES cells was demonstrated by Shinya Yamanaka and Kazutoshi Takahashi in 2006 when they introduced the genes for 24 different transcription factors into adult mouse fibroblasts. They subsequently found that introducing a combination of genes encoding only four specific transcription factors (Oct4, Sox2, Myc, and Klf4) was sufficient to reprogram the differentiated fibroblasts, converting them into undifferentiated cells that behaved like pluripotent ES cells. The cellular reprogramming process does not occur within a single cell generation but occurs gradually as the cells divide in culture. As the cells become reprogrammed, the four introduced genes are silenced as the cells’ endogenous genes governing pluripotency are activated. Reprogramming is accompanied by a reorganization of the chromatin in such a way that, for the most part, the epigenetic marks (histone modifications and DNA methylation patterns) of the differentiated cell are “erased” and the epigenetic marks characteristic of an ES cell are installed.⁶ As discussed on page 22, the induced pluripotent cells, or iPS cells as they are called, are capable of dividing indefinitely in culture and of differentiating into all of the various types of the body’s cells. Each pathway of differentiation, whether it leads to formation of a neuron, a fibroblast, or a muscle cell, is accompanied by changes in the epigenetic marks that characterize the chromatin of the cells along that pathway. These epigenetic changes have the effect of restricting the developmental potential of the cells along each pathway, making them less capable of developing into other types of cells.

⁶ This statement is qualified with the words “for the most part” because recent studies indicate that the reprogramming process is not complete. Instead, the iPS cells retain some of the epigenetic marks of the differentiated cells from which they are derived, which indicates that they are not truly equivalent to embryonic stem cells.

The Structure of Transcription Factors

The three-dimensional structures of numerous DNA–protein complexes have been determined by X-ray crystallography and NMR spectroscopy, providing a basic portrait of the way that these two macromolecules interact with one another. Like most proteins, transcription factors contain different domains that mediate different aspects of the protein’s function. Transcription factors typically contain at least two domains: a DNA-binding domain that binds to a specific sequence of base pairs in the DNA, and an activation domain that regulates transcription by interacting with other proteins. In addition, many transcription factors contain a surface that promotes the binding of the protein with another protein of identical or similar structure to form a dimer (Figure 12.40). The formation of dimers has proved to be a common feature of many different types of transcription factors and plays an important role in regulating gene expression.

Transcription Factor Motifs

The DNA-binding domains of most transcription factors can be grouped into several broad classes whose members possess related structures (motifs) that interact with DNA sequences. The existence of several families of DNA-binding proteins indicates that evolution has found a number of different solutions to the problem of constructing polypeptides that can bind to the DNA double helix. As we will see shortly, most of these motifs contain a segment (often an α helix, as in Figure 12.40) that is inserted into the major groove of the DNA, where it recognizes the sequence of base pairs that line the groove. Binding of the protein to the DNA is accomplished by a specific combination of van der Waals forces, ionic bonds, and hydrogen bonds between amino acid residues and various parts of the DNA, including the backbone. Among the most common motifs that occur in eukaryotic DNA-binding proteins are the zinc finger, the helix–loop–helix, and the leucine zipper. Each provides a structurally stable framework on which the specific DNA-recognition surfaces


Chapter 12 Control of Gene Expression

Figure 12.40 Interaction between a transcription factor and its DNA target sequence. A model of the interaction between the two DNA-binding domains of the dimeric glucocorticoid receptor (GR) and the target DNA. Two α helices, one from each subunit of the dimer, are symmetrically extended into two adjacent major grooves of the target DNA. Residues in the dimeric GR that are important for interactions between the two monomers are colored green. The four zinc ions (two per monomer) are shown as purple spheres. (FROM T. HÄRD ET AL., COURTESY OF ROBERT KAPTEIN, SCIENCE 249:159, 1990; © 1990, REPRINTED WITH PERMISSION FROM AAAS.)


Figure 12.41 Zinc-finger transcription factors. (a) A model of the complex between a protein with five zinc fingers (called GLI) and DNA. Each of the zinc fingers is colored differently; the DNA is a darker blue. Cylinders and ribbons highlight the α helices and β sheets, respectively. The inset shows the structure of a single zinc finger. (b) A model of TFIIIA bound to the DNA of the 5S RNA gene. TFIIIA is required for the transcription of the 5S rRNA gene by RNA polymerase III. (A: FROM NIKOLA K. PAVLETICH AND CARL O. PABO, SCIENCE 261:1702, 1993; © 1993, REPRINTED WITH PERMISSION FROM AAAS; B: AFTER K. R. CLEMENS ET AL., PROC. NAT’L. ACAD. SCI. USA 89:10825, 1992.)

of the protein can be properly positioned to interact with the double helix.

1. The zinc-finger motif. The largest class of mammalian transcription factors contains a motif called the zinc finger. In most cases, a zinc ion of each finger is coordinated between two cysteines and two histidines. The two cysteine residues are part of a two-stranded β sheet on one side of the finger, and the two histidine residues are part of a short α helix on the opposite side of the finger (inset, Figure 12.41a). These proteins typically have a number of such fingers that act independently of one another and are spaced apart so as to project into successive major grooves in the target DNA, as illustrated in Figure 12.41a. The first zinc-finger protein to be discovered, TFIIIA, has nine zinc fingers (Figure 12.41b). Other zinc-finger transcription factors include Egr, which is involved in activating genes required for cell division, and the GATA family of transcription factors, which are involved in multiple developmental events, including cardiac muscle development. Comparison of a number of zinc-finger proteins indicates that the motif provides the structural framework for a wide variety of amino acid


Figure 12.42 Basic helix–loop–helix (bHLH) transcription factors. (a) MyoD, a dimeric transcription factor involved in triggering muscle cell differentiation, is a bHLH protein that binds to the DNA by an associated basic region. The basic region of each MyoD monomer is red, whereas the helix–loop–helix region of each MyoD monomer is shown in brown. The DNA bases bound by the transcription factor are indicated in yellow. (b) A sketch of the dimeric MyoD complex in the same orientation as in part a. The helices are represented as cylinders. (FROM PCM MA ET AL., COURTESY OF CARL O. PABO, CELL 77:453, 1994; CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


Figure 12.43 Modifying the DNA-binding specificities of transcription factors through dimerization. In this model of a bHLH protein, three different dimeric transcription factors that recognize different DNA-binding sites can be formed when the two subunits associate in different combinations. The human genome encodes approximately 118 different bHLH monomers, which could potentially generate thousands of different dimeric transcription factors.
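The arithmetic behind this caption can be checked with a short sketch. It assumes, purely for illustration, that any two monomers can pair and that a dimer is defined by its unordered pair of subunits; real pairings are restricted, so the result is only an upper bound. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

def dimer_combinations(monomers):
    """List every unordered pair of monomers, homodimers included."""
    return list(combinations_with_replacement(monomers, 2))

def count_dimers(n):
    """Unordered pairs from n monomers: n homodimers plus n(n-1)/2 heterodimers."""
    return n * (n + 1) // 2

# Two monomers, as in the figure: A-A, A-B, and B-B.
print(dimer_combinations(["A", "B"]))

# Upper bound for about 118 bHLH monomers, assuming unrestricted pairing.
print(count_dimers(118))
```

With two monomers the sketch lists the three dimers of the figure, and with 118 monomers the upper bound is in the thousands, in line with the caption.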


sequences that recognize a diverse set of DNA sequences. In fact, researchers are attempting to design new species of zinc-finger proteins that are capable of targeting DNA sequences that govern the expression of particular genes of interest. It is hoped that such proteins might have the potential to act as therapeutic drugs by turning on or off the expression of disease-related genes.

2. The helix–loop–helix (HLH) motif. As the name implies, this motif is characterized by two α-helical segments separated by an intervening loop. The HLH domain is often preceded by a stretch of highly basic amino acids whose positively charged side chains contact the DNA and determine the sequence specificity of the transcription factor. Proteins with this basic-HLH (or bHLH) motif always occur as dimers, as illustrated in the example of the transcription factor MyoD in Figure 12.42. The two subunits of the dimer are usually encoded by different genes, making the protein a heterodimer. Heterodimerization greatly expands the diversity of regulatory factors that can be generated from a limited number of polypeptides (Figure 12.43). Suppose, for example, that a cell were to synthesize five different bHLH-containing polypeptides that were capable of forming heterodimers with one another in any combination; then 32 (2⁵) different transcription factors that recognize 32 different DNA sequences could conceivably be formed. In actuality, combinations between polypeptides are restricted, not unlike the formation of heterodimeric integrin molecules (page 244). HLH-containing transcription factors play a key role in the differentiation of certain types of tissues, including skeletal muscle (as illustrated in Figure 12.34). HLH-containing transcription factors also participate in the control of cell proliferation and have been implicated in the formation of certain cancers. It was noted on page 504 that chromosome translocations can generate abnormal genes whose expression causes the cell to become cancerous. Genes encoding at least four different bHLH proteins (MYC, SCL, LYL-1, and E2A) have been found in chromosome translocations that lead to the development of specific cancers. The most prevalent of these cancers is Burkitt’s lymphoma, in which the MYC gene on chromosome 8 is translocated to a locus on chromosome 14 that contains a regulatory site for a gene encoding part of an antibody molecule. The overexpression of the MYC gene in its new location is thought to be a significant factor in the development of this lymphoma.

3. The leucine zipper motif. This motif gets its name from the fact that leucines occur every seventh amino acid along a stretch of α helix. Because an α helix repeats every 3.5 residues, all of the leucines along this stretch of polypeptide face the same direction. Two α helices of this type are capable of zipping together to form a coiled coil. Thus, like most other transcription factors, proteins with a leucine zipper motif exist as dimers. The leucine zipper motif can bind DNA because it contains a stretch of basic amino acids on one side of the leucine-containing α helix. Together the basic segment and leucine zipper are referred to as a bZIP motif. Thus, like bHLH proteins, the α-helical portions of bZIP proteins are important in dimerization, while the stretch of basic amino acids allows the protein to recognize a specific nucleotide sequence in the DNA. AP-1, whose structure and interaction with DNA are shown in Figure 12.38, is an example of a bZIP transcription factor. AP-1 is a heterodimer whose two subunits (Fos and Jun, shown in red and blue in Figure 12.38, respectively) are encoded by the genes FOS and JUN. Both of these genes play an important role in cell proliferation and, when mutated, can contribute to a cell becoming malignant. Mutations in either of these genes that prevent the proteins from forming heterodimers also prevent the proteins from binding to DNA, indicating the importance of dimer formation in regulating their activity as transcription factors.
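The spacing rule for the leucine zipper can be made concrete with a small sketch that scans a protein sequence for leucines repeated at seven-residue intervals. The sequence below is a made-up toy string, not a real protein, and the function name and the `min_repeats` threshold are assumptions chosen for illustration.

```python
def heptad_leucine_runs(seq, min_repeats=4):
    """Return (start_index, n_leucines) for runs of Leu spaced exactly 7 residues apart."""
    runs = []
    for start, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues earlier.
        if start >= 7 and seq[start - 7] == "L":
            continue
        count = 1
        pos = start + 7
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            runs.append((start, count))
    return runs

# Toy sequence (hypothetical): a basic stretch followed by heptads that begin with Leu.
toy = "RKRAEE" + "LEDKVAA" * 4 + "L"
print(heptad_leucine_runs(toy))

# Seven residues at 3.5 residues per turn is two full turns, so successive
# leucines fall on the same face of the helix.
print(7 / 3.5)
```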


DNA Sites Involved in Regulating Transcription

The complexity inherent in the control of gene transcription can be illustrated by examining the DNA in and around a single gene. In the following discussion, we will focus on the gene that encodes phosphoenolpyruvate carboxykinase (PEPCK). PEPCK is one of the key enzymes of gluconeogenesis, the metabolic pathway that converts pyruvate to glucose (see Figure 3.31). The enzyme is synthesized in the liver when glucose levels are low, as might occur, for example, when a considerable period of time has passed since one’s last meal. Conversely, synthesis of the enzyme drops sharply after ingestion of a carbohydrate-rich meal, such as pasta. The level of synthesis of PEPCK mRNA is controlled by a variety of different transcription factors (Figure 12.44), including a number of receptors for hormones that are involved in regulating carbohydrate metabolism. The keys to understanding the regulation of PEPCK gene expression lie in (1) unraveling the functions of the numerous DNA regulatory sequences that reside upstream from the gene itself, (2) identifying the transcription factors that bind these sequences, and (3) elucidating the signaling pathways that activate the machinery responsible for selective gene expression (discussed in Chapter 15).

The closest (most proximal) regulatory sequence upstream of the PEPCK gene is the TATA box, which is a major element of the gene’s promoter (Figure 12.45). As discussed on page 442, a promoter is a region upstream from a gene that regulates the initiation of transcription. For our purposes, we will divide the eukaryotic promoter into separate regions, which are not that well delineated. The region that stretches

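[Figure 12.44 diagram labels, on a scale from −1000 to 0 relative to the transcription start site. Regulatory sites: AF1, PPARRE, GRE, TRE, IRE, P4, P3 II, P3 I, P2, P1, CRE-1, TATA. Factors and hormones shown: Fos/Jun, T3 receptor, HNF-3, PPARγ/RXR, insulin, GR, RAR, C/EBP, HNF-1, NF-1, CREB/CREM, DBP, TBP, Pol II.]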

Figure 12.44 Regulating transcription from the rat PEPCK gene. Transcription of this gene, like others, is controlled by a combination of transcription factors that interact with specific DNA sequences located in a regulatory region upstream from the gene’s coding region. Included within this region is a glucocorticoid response element (GRE) that, when bound by a glucocorticoid receptor, stimulates transcription from the promoter. Also included within the regulatory region are binding sites for a thyroid hormone receptor (labeled TRE); a protein that binds cyclic AMP, which is produced in response to the hormone glucagon (labeled CRE-1); and the hormone insulin (labeled IRE). A number of other transcription factors are also seen to bind to regulatory sites in this region upstream from the PEPCK gene. (FROM S. E. NIZIELSKI ET AL., J. NUTRITION 126:2699, 1996; © AMERICAN SOCIETY FOR NUTRITIONAL SCIENCES.)

from roughly the TATA box to the transcription start site is designated the core promoter. The core promoter is the site of assembly of a preinitiation complex consisting of RNA polymerase II and a number of general transcription factors (GTFs) that are required before a eukaryotic gene can be transcribed. The core promoter elements, and the GTFs that bind to them (shown in Figure 11.18), define the exact site at which transcription will start (the +1 site). The TATA box is not the only short sequence that is shared by large numbers of genes. Two other sequences, called the CAAT box and GC box, which are located farther upstream from the gene, are often required for RNA polymerase II to initiate transcription. The CAAT and GC boxes bind transcription factors (e.g., NF1 and SP1) that are found in many tissues and widely employed in gene expression. Whereas the TATA box determines the site of initiation of transcription, the CAAT and GC boxes are two of many sequences that regulate the frequency with which RNA polymerase II initiates transcription. The TATA box is within the core promoter, usually within 25–30 bases from the start site of transcription. The CAAT and GC boxes, when present, are typically located within 100–150 base pairs upstream from the transcription start site, a region referred to as the proximal promoter region, as indicated in the top line of Figure 12.45. The more genes that are examined, the more variability is found in the nature and locations of the promoter elements that regulate gene expression. In addition, a significant number of mammalian genes (possibly as many as 50 percent of them) have more than one promoter (i.e., alternative promoters), allowing initiation of transcription to occur at more than one site upstream from a gene. Alternative promoters are typically separated from one another by several hundred bases so that the primary transcripts produced by their differential

[Figure 12.45 diagram. Regions marked along the promoter: distal promoter elements, proximal promoter elements, core promoter elements. Position scale: −130 to +1. Sequence shown: CGTGCTGACCATGGCTATGATCCAAAGGCCGGCCCCTTACGTCAGAGCGAGC CTCCAAGGTCCAGCTGAGGGGCAGGGCTGTCCTCCCTTCTGTATACTATTTAAAGCGAGGAGGGCTAGCTACCAAGCA. Labeled elements: CAAT box, GC box, TATA box. Relative promoter activity: complete promoter (control), 100%; promoters with deletions, 38%, 95%, 14%, 50%, 114%.]
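The logic of deletion mapping can be sketched by comparing each deletion construct with the intact promoter. The percentages are the relative activities listed for Figure 12.45; pairing them in order with lines 2 through 6 of the figure is an assumption based on the order in which they appear, and the 20 percent tolerance is an arbitrary cutoff chosen only for illustration.

```python
# Relative promoter activity (% of intact control) for Figure 12.45.
# Line 1 is the complete promoter; the assignment of the remaining
# values to lines 2-6 is assumed from their order in the figure.
RELATIVE_ACTIVITY = {1: 100, 2: 38, 3: 95, 4: 14, 5: 50, 6: 114}

def interpret_deletion(percent, tolerance=20):
    """Classify a deletion by its effect on transcription.

    tolerance is an illustrative cutoff, not a value from the text.
    """
    if percent < 100 - tolerance:
        return "decrease: deleted region likely binds an activator"
    if percent > 100 + tolerance:
        return "increase: deleted region may bind a repressor"
    return "little or no effect"

for line, percent in RELATIVE_ACTIVITY.items():
    if line == 1:
        continue  # skip the undeleted control
    print(f"line {line}: {percent}% -> {interpret_deletion(percent)}")
```

Under these assumptions the constructs on lines 2, 4, and 5 are flagged as having lost an activating element, while those on lines 3 and 6 fall within the tolerance band.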

Figure 12.45 Identifying promoter sequences required for transcription. The top line shows the nucleotide sequence of one strand of the PEPCK gene promoter. The TATA box, CAAT box, and GC box are indicated. The other five lines show the results of experiments in which particular regions of the promoter were deleted (indicated by black boxes). The observed level of transcription of the PEPCK gene in each of these cases is indicated at the right. Deletions that remove all or part of the three boxes cause a marked decrease in the level of transcription, whereas deletions that affect other regions have either a lesser or no effect. (As noted in Chapter 11, many mammalian promoters lack the TATA box. Promoters lacking the TATA box often have a conserved element at a position downstream of the start site, called the downstream promoter element, or DPE.) (DATA TAKEN FROM STUDIES BY RICHARD W. HANSON AND DARYL K. GRANNER AND THEIR COLLEAGUES.)

usage will differ considerably at their 5′ ends. In some cases, alternative promoters are utilized in different tissues, but they lead to synthesis of the same polypeptide. In such cases, the mRNAs that encode the proteins are likely to have different 5′ UTRs, which makes them subject to distinct types of translational-level control. In other cases, alternative promoters promote the synthesis of related proteins that differ in their N-terminal amino acid sequence. In a number of cases, alternative promoters govern the transcription of mRNAs that are translated in different reading frames (page 468) to produce entirely different polypeptides.

How do researchers learn which sites in the genome interact with a particular transcription factor? The following general strategies are often employed to make this determination.

■ Deletion mapping. In this procedure, DNA molecules are prepared that contain deletions of various parts of the gene’s promoter (Figure 12.45). The altered DNAs are then introduced into cells and the ability of the deletion mutants to initiate transcription is measured. In many cases, the deletion of a few nucleotides has little or no effect on the transcription of an adjacent gene. However, if the deletion falls within a region that prevents binding of transcription factors that activate transcription, the level of transcription will decrease (as in lines 2, 4, and 5 of Figure 12.45). Deletion of other regions might increase transcription, which would indicate that a transcription factor binds to this region that inhibits transcription, in other words, a transcription repressor. Finally, deletion of other regions might have little to no effect (lines 3 and 6 of Figure 12.45).

■ DNA footprinting. When a transcription factor binds to a DNA sequence, the presence of the protein protects the DNA sequence from digestion by nucleases. Researchers take advantage of this property by isolating chromatin from cells and treating it with DNA-digesting enzymes, such as DNase I. Regions that are not protected by proteins are digested, whereas those with bound proteins are protected. Once the chromatin has been digested, the bound protein is removed, and the protected DNA sequences are identified.

■ Genome-wide location analysis. As the name implies, this strategy allows researchers to simultaneously monitor all of the sites within the genome that are bound by a given transcription factor under a given set of physiologic conditions. An outline of the approach is shown in Figure 12.46. To carry out this analysis, cells are cultured under the desired conditions or isolated from a particular tissue or developmental stage, and then treated with an agent, such as formaldehyde, that will kill the cells and cross-link transcription factors to the DNA sites at which they were bound in the living cell (step 1, Figure 12.46). Following the cross-linking step, chromatin is isolated, sheared into small fragments, and incubated with an antibody that binds to the transcription factor of interest (step 2). Binding of the antibody leads to the precipitation of chromatin fragments containing the bound transcription factor (step 3a), while leaving all of the unbound chromatin fragments in solution (step 3b). Once this process of chromatin immunoprecipitation (or ChIP) has been carried out, the cross-links between the protein and DNA in the precipitate can be reversed and the segments of DNA can be purified (step 4). The next step is to identify where in the genome these transcription-factor binding sites are located. Two different approaches can be taken to make this determination: either the purified DNA is fluorescently labeled and used in microarrays (steps 5–7) in a technique called ChIP-chip, or the purified DNA can be directly sequenced using massively parallel sequencing approaches (step 5a) in a technique called ChIP-seq. Unlike the microarray shown in Figure 12.35, which contains DNA probes representing protein-coding genes, the microarrays used in these ChIP-chip experiments contain DNA probes from across the entire genome. Whether by microarray or by direct sequencing, the experiments enable identification of specific binding sites across the entire genome for any factor of interest.

[Figure 12.46 diagram labels. Step 1: treat Petri dish with cells with formaldehyde to kill cells and cross-link transcription factors to DNA; isolate chromatin and shear it into fragments. Step 2: incubate with antibody that binds to transcription factor of interest. Step 3a: immunoprecipitated chromatin fragments. Step 3b: chromatin fragments remain in solution. Step 4: reverse cross-links, purify DNA. Step 5a: sequence DNA (ChIP-seq). Steps 5–7: amplify DNA fragments with fluorescently labeled nucleotides and hybridize to DNA microarray containing intergenic regions (ChIP-chip).]

Figure 12.46 Use of chromatin immunoprecipitation (ChIP) to identify transcription-factor binding sites on a global scale. The steps are described in the text.
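Binding-site lists obtained by ChIP for several factors can be compared to find regions occupied by all of them, which is one way to prioritize sites, as is done for Oct4, Sox2, and Nanog in embryonic stem cells. The sketch below is a minimal illustration: the coordinates are invented, the function name is hypothetical, and the 100-bp window used to define "close proximity" is an assumption rather than a value from the text.

```python
def co_bound(sites_by_factor, window=100):
    """Return sites of the first factor that have a site of every other
    factor within `window` base pairs on the same chromosome."""
    factors = list(sites_by_factor)
    first, others = factors[0], factors[1:]
    hits = []
    for chrom, pos in sites_by_factor[first]:
        if all(
            any(c == chrom and abs(p - pos) <= window
                for c, p in sites_by_factor[factor])
            for factor in others
        ):
            hits.append((chrom, pos))
    return hits

# Invented binding-site positions (chromosome, coordinate), for illustration only.
sites = {
    "Oct4": [("chr1", 1000), ("chr1", 50000), ("chr2", 700)],
    "Sox2": [("chr1", 1040), ("chr2", 9000)],
    "Nanog": [("chr1", 980), ("chr1", 50010)],
}
shared = co_bound(sites)
print(shared)
```

Only the site near position 1000 on the toy chromosome chr1 is bound by all three factors here; the others are occupied by one or two factors and would be given lower priority.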

Interestingly, when these types of experiments are performed with mammalian transcription factors, a significant percentage of the DNA binding sites are located at considerable distances from known gene promoters. It is speculated that some of these sites are involved in regulating transcription of noncoding RNAs, such as the microRNAs discussed on page 459. It is also likely that many of the thousands of genetic loci identified in these genome-wide screens have little or no real importance in the regulation of gene expression. One way to assess the likely importance of the binding of a transcription factor is to consider the data in a larger context. For example, a number of transcription factors, such as Oct4, Sox2, and Nanog, are important in maintaining the pluripotent state of embryonic stem cells (page 21). When tested alone, each of these three transcription factors is bound to several thousand sites within the genome of embryonic stem cells. In contrast, only a small fraction of these collected genomic regions is bound by all three of these key transcription factors. The presence of all three transcription factors bound in close proximity to one another provides much greater confidence that a given region is functionally important in regulating gene expression in these cells. In addition to studying transcription factors, the ChIP technique can be modified to locate the positions within the genome of any other type of DNA-bound protein, such as a particular type of modified histone or even the locations of bound RNA polymerase molecules.

The Glucocorticoid Receptor: An Example of Transcriptional Activation

Transcription can be said to be controlled by a combinatorial code consisting of an array of transcription factors and changes in their preferred binding sites from gene to gene. The sites at which these transcription factors bind to the region flanking a gene are often called response elements.
Here, we will focus on the glucocorticoids, a group of steroid hormones (e.g., cortisol) synthesized by the adrenal gland. Analogues of these hormones, such as prednisolone, are prescribed as potent anti-inflammatory agents. Secretion of glucocorticoids is highest during periods of stress, as occurs during starvation or following a severe physical injury. For a cell to respond to glucocorticoids, it must possess a specific receptor capable of binding the hormone. The glucocorticoid receptor (GR) is a member of a superfamily of nuclear receptors (including receptors for thyroid hormone, retinoic acid, and estrogen) that are thought to have evolved from a common ancestral protein. Members of this superfamily are more than just hormone receptors; they are also DNA-binding transcription factors. When a glucocorticoid hormone enters a target cell, it binds to a glucocorticoid receptor protein in the cytosol, changing the conformation of the protein. This change exposes a nuclear localization signal (page 492), which facilitates translocation of the receptor into the nucleus (Figure 12.47). The ligand-bound receptor binds to a specific DNA sequence, called a glucocorticoid response element (GRE). For the PEPCK gene, this site is located upstream from the core promoter (Figure 12.44), and binding activates transcription of the gene (Figure 12.47). Identical or similar GRE sequences are located upstream from a number of other genes on different chromosomes. If the site is the exact preferred binding site, the gene will be highly responsive to elevated glucocorticoid levels, whereas genes with imperfect binding sites will respond in accordance with the change in binding affinity for the glucocorticoid-receptor complex. Consequently, a single stimulus (elevated glucocorticoid concentration) can simultaneously activate a number of genes, each at its own precise level, allowing a comprehensive, finely tuned response that meets the needs of the cell. The preferred glucocorticoid response element consists of the following sequence:

5′-AGAACAnnnTGTTCT-3′
3′-TCTTGTnnnACAAGA-5′

where n can be any nucleotide. A symmetrical sequence of this type is called a palindrome because the two strands have the same 5′ to 3′ sequence. The GRE is seen to consist of two defined stretches of nucleotides separated by three undefined nucleotides. The twofold nature of a GRE is important because pairs of GR polypeptides bind to the DNA as dimers in which each subunit of the dimer binds to one-half of the DNA sequence indicated above (see Figure 12.40). The importance of the GRE in mediating a hormone response is most clearly demonstrated by introducing one of these sequences into the upstream region of a gene that normally does not respond to glucocorticoids. When cells containing DNA that has been engineered in this way are exposed to glucocorticoids, transcription of the gene downstream from the transplanted GRE is initiated. Visual evidence that gene transcription can be stimulated by foreign regulatory regions is presented in Figure 18.47.
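The symmetry of the preferred GRE can be verified directly: a palindromic DNA sequence equals its own reverse complement. The sketch below checks that property and counts mismatches between a candidate site and the preferred sequence. The function names and the example candidate sites are hypothetical, and mismatch counting is only a crude stand-in for binding affinity.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

def reverse_complement(seq):
    """Return the 5'-to-3' sequence of the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

# Preferred GRE from the text; N marks the three unspecified positions.
GRE = "AGAACANNNTGTTCT"

def mismatches(site, consensus=GRE):
    """Count positions where a site differs from the consensus, ignoring N."""
    return sum(
        1 for s, c in zip(site.upper(), consensus) if c != "N" and s != c
    )

# A palindrome reads the same 5' to 3' on both strands.
print(reverse_complement(GRE) == GRE)

# Each half-site, read 5' to 3' on its own strand, is the same hexamer.
print(GRE[:6], reverse_complement(GRE[-6:]))

# Hypothetical candidate sites: one perfect match, one with a single change.
print(mismatches("AGAACATGATGTTCT"), mismatches("AGAACATGATGTACT"))
```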

Figure 12.47 Activation of a gene by a steroid hormone. The hormone cortisol, a glucocorticoid, enters the cell from the extracellular fluid (step 1), diffusing through the lipid bilayer (step 2) and into the cytoplasm, where it binds to a glucocorticoid receptor (step 3), changing its conformation and causing it to translocate into the nucleus, where it acts as a transcription factor and binds to a glucocorticoid response element (GRE) of the DNA (step 4). The glucocorticoid receptor binds to the GRE as a dimer, which activates transcription of the DNA (step 5), leading to the synthesis of specific proteins in the cytoplasm (step 6).

Transcriptional Activation: The Role of Enhancers, Promoters, and Coactivators

The GRE situated upstream from the PEPCK gene, and the other response elements illustrated in Figure 12.44, are referred to as distal promoter elements to distinguish them from the proximal promoter elements situated closer to the gene or the core promoter which dictates the site of initiation (Figure 12.45). The expression of most genes is also regulated by even more distant DNA elements called enhancers. An enhancer typically contains multiple binding sites for sequence-specific transcriptional activators. Enhancers are often distinguished from promoter elements by a unique property: they can be situated either upstream or downstream of the start site, and they can even be inverted (rotated 180°), without affecting the ability of a bound transcription factor to stimulate transcription. Deletion of an enhancer can decrease the level of transcription by 100-fold or more. A typical mammalian gene may have a number of enhancers scattered within the DNA in the vicinity of the gene. Different enhancers typically bind different sets of transcription factors and respond independently to different stimuli. Some enhancers are located thousands or even tens of thousands of base pairs upstream or downstream from the gene whose transcription they stimulate.⁷ Even though enhancers and promoters may be separated by large numbers of nucleotides, enhancers are thought to stimulate transcription by influencing events that occur at the core promoter. Enhancers and core promoters can be brought into proximity because the intervening DNA can form a loop through the interactions of bound proteins (Figure 12.48). If enhancers can interact with promoters over such long distances, what is to prevent an enhancer from binding to an inappropriate promoter located even farther downstream on the DNA molecule? A promoter and its enhancers are, in essence, “cordoned off” from other promoter/enhancer elements by specialized boundary sequences called insulators.

One of the most active areas of molecular biology in the past decade or so has focused on the mechanism by which a transcriptional activator bound to an enhancer is able to stimulate the initiation of transcription at the core promoter. Transcription factors accomplish this feat through the action of intermediaries known as coactivators. Coactivators are large complexes that consist of numerous subunits. Coactivators can be broadly divided into two functional groups: (1) those that interact with components of the basal transcription machinery (the general transcription factors and RNA polymerase II) and (2) those that act on chromatin, converting it from a state that is relatively inaccessible to the transcription machinery to a state that is much more transcription “friendly.” Figure 12.48 shows a schematic portrait of four types of coactivators, two of each of the major groups.
These various types of coactivators work together in an orderly manner to activate the transcription of particular genes in response to specific intracellular signals. Given the large number of transcription factors encoded in the genome, and the limited diversity of coactivators, each coactivator complex operates in conjunction with a wide variety of different transcription factors. The coactivator CBP, for example, which is discussed below, participates in the activities of hundreds of different transcription factors. Coactivators that Interact with the Basal Transcription Machinery Transcription is accomplished through the recruitment and subsequent collaboration of large protein complexes. TFIID, one of the GTFs required for initiation of transcription (page 442), consists of a dozen or more subunits. The key component is TATA binding protein (TBP) which binds the TATA box. Its binding is facilitated by a number of associated factors denoted as TAFs (TBP-associated factors). Some transcription factors are thought to influence events at the core promoter by interacting with one or more of these TFIID subunits. Another coactivator that communicates di7

There is wide variation in the terminology used to describe the various types of regulatory elements that control gene transcription. The terms employed here— core promoter, proximal promoter elements, distal promoter elements, and enhancers—are not universally adopted, but they describe elements that are usually present and often (but not always) capable of being distinguished.

Enhancer

Insulator

Transcriptional activator

Histone modification complex Chromatin remodeling complex

Mediator TAFs RNAPII TBP Nucleosomes

Figure 12.48 The mechanisms by which transcriptional activators bound at distant sites can influence gene expression. Transcriptional activators bound at upstream enhancers influence gene expression through interaction with coactivators. Four different types of coactivators are depicted here, two of them labeled “Histone modification complex” and “Chromatin remodeling complex” act by altering the structure of chromatin. Two others, labeled “TAFs” and “Mediator” act on components of the basal transcription machinery that assembles at the core promoter. These various types of coactivators are discussed in the following sections.

rectly between enhancer-bound transcription factors and the basal transcription machinery is called Mediator, which is a huge, multisubunit complex that interacts directly with RNA polymerase II. Mediator is required by a wide variety of transcriptional activators and may be an essential element in the transcription of most, if not all, protein-coding genes. Despite considerable effort, the mechanism of action of Mediator remains unclear.

Coactivators that Alter Chromatin Structure

The discovery of nucleosomes in the 1970s raised an important question that still has not been completely answered: how are nonhistone proteins (such as transcription factors and RNA polymerases) able to interact with DNA that is tightly wrapped around core histones? A large body of evidence suggests that incorporation of DNA into nucleosomes does, in fact, impede access to DNA and markedly inhibits both the initiation and elongation stages of transcription. How do cells overcome this inhibitory effect that results from chromatin structure? As discussed on page 495, each of the histone molecules of the nucleosome core has a flexible N-terminal tail that extends outside the core particle and past the DNA helix. As discussed earlier, covalent modifications of these tails have a profound impact on chromatin structure and function. We have already seen how the addition of methyl groups on H3K9 residues can promote chromatin compaction and transcriptional silencing (page 501). The addition of acetyl groups to specific lysine residues in core histones tends to have the opposite effect. On a larger scale, acetylation of histone residues is thought to prevent chromatin fibers from folding into compact structures, which helps to maintain active, euchromatic regions. On a finer scale, histone acetylation increases access of specific regions of the DNA template to interacting proteins, which promotes transcriptional activation. As we will see shortly, the enzymes that carry out these histone modifications are parts of large multiprotein complexes involved in regulating gene expression.

Techniques have been developed in recent years to determine the nature of the modifications in the histones of nucleosomes on a genome-wide scale. These techniques involve a ChIP technique similar to that illustrated in Figure 12.46, but rather than determining the genome-wide location of particular transcription factors, the goal is to pinpoint the precise locations of particular histone modifications within the genome. Rather than using antibodies against transcription factors to precipitate a particular fraction of protein-bound DNA segments, researchers use antibodies that specifically recognize particular histone modifications, such as acetylated H3K9 or H4K16, or methylated H3K4 or H3K36. All four of these histone marks had been known to be associated with transcriptionally active genes (Figure 12.18), but it wasn’t known how these marks might be distributed within such genes. Are specific histone modifications spread uniformly throughout each gene, or are there differences from one end to the other? Results from analysis of active genes in the yeast genome are shown in Figure 12.49 and reveal marked differences within various parts of these genes. Acetylated H3 and H4 histones and methylated H3K4 residues are clustered in the promoter regions and largely absent from the main body of active genes, suggesting that these modifications are primarily important in the activation of a gene or the initiation of its transcription. In contrast, methylated H3K36 residues are largely absent from the promoters and instead are concentrated in the transcribed regions of active genes.
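The promoter-versus-gene-body comparison described above can be illustrated with a small Python sketch. The signal values are invented toy numbers, not data taken from Figure 12.49, and the function names and the 500-bp promoter window are assumptions made only for illustration. The sketch averages a ChIP-style signal for each mark on either side of that window and reports where the mark is enriched.

```python
from statistics import mean

# Toy enrichment values (arbitrary units), keyed by position relative to the
# transcription start site (bp). Invented for illustration only.
TOY_SIGNAL = {
    "H3/H4 acetylation": {-400: 8.0, -200: 9.5, 0: 9.0, 200: 6.0, 600: 2.0, 1000: 1.5, 1400: 1.0},
    "H3K4 methylation":  {-400: 6.0, -200: 8.0, 0: 9.0, 200: 7.0, 600: 2.5, 1000: 1.0, 1400: 1.0},
    "H3K36 methylation": {-400: 1.0, -200: 1.0, 0: 1.5, 200: 2.0, 600: 6.0, 1000: 8.0, 1400: 8.5},
}

PROMOTER_WINDOW = 500  # assumed half-width (bp) of the promoter region around the TSS


def promoter_vs_body(profile, window=PROMOTER_WINDOW):
    """Return (mean promoter signal, mean transcribed-region signal)."""
    promoter = [v for pos, v in profile.items() if abs(pos) <= window]
    body = [v for pos, v in profile.items() if pos > window]
    return mean(promoter), mean(body)


def enriched_region(profile):
    """Name the region in which the mark has the higher mean signal."""
    promoter, body = promoter_vs_body(profile)
    return "promoter" if promoter > body else "transcribed region"


for mark, profile in TOY_SIGNAL.items():
    print(mark, "->", enriched_region(profile))
```

With these toy values, the two promoter-associated marks and the gene-body mark separate in the way the text describes; real genome-wide data would of course be averaged over many genes.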
Several studies provide a rationale for these differences in location and

[Figure 12.49 labels: Promoter; TSS; Transcribed region; H3/H4 acetylation; H3K4 methylation; H3K36 methylation]

Figure 12.49 Histone modifications can act as signatures of transcribed chromatin regions. The figure depicts selective localization of histone modifications within the chromatin of transcribed genes based on genome-wide analyses in yeast. Histone acetylation and H3K4 methylation are localized primarily in the promoter region of active genes and decrease markedly in the transcribed portion of the gene. In contrast, H3K36 methylation displays the opposite localization pattern. TSS, transcription start site.

12.4 Transcriptional Control


give an example of the complex relationships between histone modifications. As described on page 443, the initiation of transcription is correlated with the phosphorylation of Ser5 residues in the CTD of RNA polymerase II. These phosphorylated Ser5 residues serve as a recognition platform for the histone methyltransferase Set1, which methylates H3K4 in the chromatin of the promoter. Methylated H3K4 residues, in turn, serve as binding sites for a number of protein complexes, including ones involved in chromatin remodeling (see Figure 12.51) and pre-mRNA splicing (see Figure 11.32). In contrast, methylation of H3K36 is catalyzed by the histone methyltransferase Set2, which travels with the elongating polymerase. Once this residue is methylated, it serves as a recruitment site for another enzyme complex (Rpd3S) that catalyzes the removal of acetyl groups from lysine residues of histones in the transcribed portion of the gene. Evidence suggests that removal of acetyl groups from nucleosomes in the wake of an elongating RNA polymerase prevents the inappropriate initiation of transcription within the internal coding region of a gene. Let’s look more closely at events that occur at the promoter regions of genes during the activation of transcription. Acetyl groups are added to specific lysine residues on the core histones by a family of enzymes called histone acetyltransferases (HATs). In the late 1990s, it was discovered that a number of coactivators possessed HAT activity. If the HAT activity of these coactivators was eliminated by mutation, so too was their ability to stimulate transcription. The discovery that coactivators contain HAT activity provided a crucial link between histone acetylation, chromatin structure, and gene activation. Figure 12.50 shows an ordered series of reactions that has been proposed to occur following the binding of a transcriptional activator, such as the glucocorticoid receptor, to its response element on the DNA. 
Once bound to the DNA, the activator recruits a coactivator (e.g., CBP) to a region of the chromatin that is targeted for transcription. Once positioned at the target region, the coactivator acetylates the core histones of nearby nucleosomes, which exposes a binding site for a chromatin remodeling complex (discussed below). The combined actions of these various complexes increase the accessibility of the promoter to the components of the transcription machinery, which assembles at the site where transcription will be initiated. Figure 12.50 depicts the activity of two coactivators that affect the state of chromatin. We have already seen how the HATs act to modify the histone tails; let us look more closely at the other type of coactivator, the chromatin remodeling complexes. Chromatin remodeling complexes use the energy released by ATP hydrolysis to alter nucleosome structure and location along the DNA. This in turn may allow the binding of various proteins to regulatory sites in the DNA. The best studied chromatin remodeling machines are members of the SWI/SNF family. SWI/SNF complexes, which consist of 9–12 subunits, do not bind to specific DNA sequences but rather are recruited to specific promoters by either epigenetic marks present on nucleosomal histones or other proteins already bound to the DNA. In Figure 12.50, the coactivator CBP has acetylated the core histones, providing a high-affinity

[Figure 12.50 labels, steps 1–2: Glucocorticoid receptor (GR); GRE; TATA; Deacetylated histones; CBP (histone acetyltransferase); Acetylated histones]

Figure 12.50 A model describing the activation of transcription. Transcription factors, such as the glucocorticoid receptor (GR), bind to the DNA and recruit coactivators, which facilitate the assembly of the transcription preinitiation complex. Step 1 of this drawing depicts a region of a chromosome that is in a repressed state because of the association of its DNA with deacetylated histones. In step 2, the GR is bound to the GRE, and the coactivator CBP has been recruited. CBP contains a subunit that has histone acetyltransferase (HAT) activity. These enzymes transfer acetyl groups from an acetyl CoA donor to the amino groups of specific lysine residues on histone proteins. As a result, histones of the nucleosome core particles in the regions both upstream and downstream from the TATA box become acetylated. In step 3, the acetylated histones recruit SWI/SNF, which is a chromatin remodeling complex. Together, the two coactivators CBP and SWI/SNF change the structure of the chromatin to a more open, accessible state. In step 4, TFIID binds to the open region of the DNA. One of the subunits of TFIID (called TAFII250 or TAF1) also possesses histone acetyltransferase activity as indicated by the red arrow. Together, CBP and TAFII250 modify additional nucleosomes to allow transcription initiation. In step 5, the remaining nucleosomes of the promoter have been acetylated, RNA polymerase II is bound to the promoter, and transcription is set to begin.

[Figure 12.50 labels, steps 3–4: Acetylated histones; SWI/SNF (chromatin remodeling complex); Acetylation; TAFII250; TATA; TBP; Other general transcription factors (see Fig. 11.18)]

Chapter 12 Control of Gene Expression

[Figure 12.50 labels, step 5: TAFII250; TATA; TBP; RNA Pol II]

binding site for the remodeling complex. Once recruited to a promoter, chromatin remodeling complexes are thought to disrupt histone–DNA interactions, which can

1. promote the mobility of the histone octamer so that it slides along the DNA to new positions (Figure 12.51, path 1). In the best studied case, the binding of transcriptional activators to an enhancer upstream from the IFN-β gene leads to the sliding of a key nucleosome approximately 35 base pairs along the DNA, exposing the TATA box that had been previously covered by histones. Sliding occurs as the remodeling complex translocates along the DNA.
2. change the conformation of the nucleosome. In the example depicted in Figure 12.51, path 2, the DNA has formed a transient loop or bulge on the surface of the histone octamer, making that site more accessible for interaction with DNA-binding regulatory proteins.
3. facilitate the replacement within the histone octamer of a standard core histone by a histone variant (page 495) that is correlated with active transcription. For example, the SWR1 complex exchanges H2A/H2B dimers with a variant histone (H2A.Z) that dimerizes with H2B (Figure 12.51, path 3).
4. displace the histone octamer from the DNA entirely (Figure 12.51, path 4). For example, nucleosomes are thought to be temporarily disassembled as an elongating RNA polymerase complex moves along the DNA within a gene.

Evidence has accumulated for many years that nucleosomes are not positioned randomly along the DNA and, more importantly, that specific nucleosome positioning plays an important role in regulating transcription of specific regions of the genome. A number of attempts have been made to determine whether nucleosomes are preferentially present or absent from particular sites within the genome in a given type of cell. To carry out these studies, the chromatin is typically treated with a nuclease that digests those portions of the DNA that are not protected by their association with histone octamers. The DNA that has been protected from nuclease digestion is then dissociated from its associated histones and sequenced. These techniques have allowed researchers to prepare genome-wide maps of nucleosome positions along the DNA. The most complete analyses of nucleosome positioning have been carried out


Figure 12.51 Chromatin remodeling. In pathway 1, a key nucleosome slides along the DNA, thereby exposing the TATA binding site and allowing the preinitiation complex to assemble. In pathway 2, the histone octamer of a nucleosome has been reorganized. Although the TATA box is not completely free of histone association, it is now able to bind the proteins of the preinitiation complex. In pathway 3, the standard H2A/H2B dimers of a nucleosome have been exchanged with histone variants (e.g., H2A.Z/H2B dimers) that associate with active chromatin. In pathway 4, the histone octamer has been disassembled and is lost from the DNA entirely.

on yeast cells, and they provide a somewhat different picture from the traditional view that is based on years of biochemical studies on the transcription of individual genes in mammalian cells that was presented in Figure 12.50.
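The mapping strategy described above (nuclease digestion followed by sequencing of the protected DNA) can be sketched computationally. In this minimal Python example the fragment coordinates are invented, and the function names and the "zero coverage for at least 50 bp" criterion for calling a nucleosome-free stretch are assumptions chosen for illustration, not the procedure used in the studies cited in the text.

```python
def occupancy(fragments, length):
    """Count, for each base pair, how many protected fragments cover it.

    fragments: iterable of half-open (start, end) intervals on the locus.
    """
    coverage = [0] * length
    for start, end in fragments:
        for i in range(max(start, 0), min(end, length)):
            coverage[i] += 1
    return coverage


def uncovered_regions(coverage, min_len=50):
    """Return (start, end) runs of zero coverage that are at least min_len bp long."""
    regions, run_start = [], None
    for i, depth in enumerate(coverage + [1]):  # sentinel closes a trailing run
        if depth == 0 and run_start is None:
            run_start = i
        elif depth != 0 and run_start is not None:
            if i - run_start >= min_len:
                regions.append((run_start, i))
            run_start = None
    return regions


# Invented 147-bp protected fragments on a 940-bp toy locus.
fragments = [(0, 147), (165, 312), (170, 317), (335, 482),
             (620, 767), (625, 772), (790, 937)]
coverage = occupancy(fragments, 940)
print(uncovered_regions(coverage))  # prints [(482, 620)]
```

The short gaps between neighboring fragments behave like linker DNA and are filtered out by the length threshold, whereas the single long unprotected stretch is the kind of feature that would be reported as a nucleosome-free region.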

Studies on yeast cells suggest that the majority of genes share a common chromatin architecture, which is depicted in Figure 12.52. As indicated in this figure, nucleosomes are not randomly distributed along the DNA but instead are surprisingly well positioned. Most notably, the promoter sequences tend to reside within nucleosome-free regions (NFRs of Figure 12.52), which are flanked on either side by two very well-positioned nucleosomes identified as −1 and +1 in Figure 12.52. The nucleosome-free region surrounding the promoter is likely an important consideration in allowing access by regulatory factors to these target sites in the DNA. Of all the nucleosomes in a yeast gene, the −1 nucleosome undergoes the most extensive modifications upon transcriptional activation. This nucleosome may be moved to a new site, evicted from the DNA, or undergo extensive histone modifications or histone exchange. The downstream nucleosomes (+1, +2, +3, etc.) are also subject to many of these same alterations during transcriptional activation, but the degree to which a nucleosome is affected diminishes with its distance from the transcriptional start site. It is not clear yet whether

[Figure 12.51 labels: 1, Sliding; 2, Conformational change; 3, Histone exchange; 4, Histone dissociation. Figure 12.52 labels: Displaced nucleosome; RNA polymerase]

Figure 12.52 The nucleosomal landscape of yeast genes. The top portion of the illustration shows a typical region of DNA in the vicinity of a gene that is being actively transcribed by an RNA polymerase II complex. The consensus positions of nucleosomes are illustrated by the gray ovals. The lower portion of the illustration shows the distribution of nucleosomes along the DNA with the height of the line reflecting the likelihood that a nucleosome is found at that site. The 5′ region of the gene has the most highly defined chromatin architecture. There is a very high likelihood that the promoter region is bare of nucleosomes (a nucleosome-free region or NFR) flanked by two very well-positioned nucleosomes that lie on either side of the transcription start site (green light); these two nucleosomes are identified as −1 and +1 at the top of the figure. The subsequent nucleosomes near the 5′ end of the transcribed region also tend to be well positioned, as indicated by the distinct locations of these nucleosomes at the top of the drawing. A nucleosome is also seen to be displaced by RNA polymerase as it transcribes the gene. The green shading corresponds to a region of chromatin with high levels of the H2A.Z histone variant, high histone acetylation and H3K4 methylation, and a high likelihood of nucleosomal positioning. (FROM CIZHONG JIANG AND B. FRANKLIN PUGH, NATURE REVS. GEN. 10:164, 2009; © COPYRIGHT 2009, MACMILLAN MAGAZINES LIMITED, ADAPTED FROM T. N. MAVRICH, ET AL., GENOME RES. 18:1074, 2008.)


the promoter regions of most nontranscribed genes in higher eukaryotes tend to be relatively covered in nucleosomes as depicted in Figure 12.50 (referred to as “closed promoters”) or relatively free of nucleosomes as suggested in Figure 12.52 (referred to as “open promoters”). Even if the transcription start site is covered by a nucleosome, studies suggest that this nucleosome may be less stable than its neighbors, containing histone variants (H3.3 and H2A.Z) in place of the standard core histones. Regardless of the specific nucleosome topology, the promoter regions of chromatin are the targets of a wide variety of histone modifying enzymes (e.g., histone acetyltransferases, deacetylases, methyltransferases, and demethylases), chromatin remodeling complexes (e.g., SWI/SNF), and gene-specific transcription factors.


Transcriptional Activation from Paused Polymerases

Throughout this discussion of transcriptional activation, we have described how transcription factors bind to specific DNA sequences and induce the recruitment of general transcription factors (GTFs), chromatin modification complexes, and RNA polymerase II, which leads to the initiation of transcription. It came as a surprise to learn that RNA polymerases are also bound to the promoters of many genes that show no evidence they are being transcribed. In some cases, an RNA polymerase situated at one of these “transcriptionally silent” genes initiates RNA synthesis, but the polymerase fails to transition to the elongation stage of transcription. In other cases, the polymerase goes on to synthesize an RNA of about 30 nucleotides and then becomes stalled. In either case, a full-length primary transcript is never generated. According to one model, RNA polymerase molecules situated downstream of promoters are held in the paused state by bound inhibitory factors (e.g., DSIF and NELF). Inhibition is then relieved as (1) these factors are phosphorylated by stimulatory kinases (e.g., P-TEFb) and (2) elongation factors (e.g., ELL) are recruited to the polymerase (as in Figure 11.20b). These studies have suggested that some transcription factors (e.g., Myc) may stimulate the transcription of some genes by acting at the level of transcription elongation as well as that of transcription initiation. Because it doesn’t require the assembly of the transcriptional machinery at the promoter, the induced release of paused polymerases may facilitate the rapid activation of genes in response to developmental or environmental signals.

Transcriptional Repression

As is evident from Figures 12.2 and 12.3, control of transcription in bacteria relies heavily on repressors that block transcription. Although research in eukaryotes has focused primarily on factors that activate or enhance transcription of specific genes, eukaryotic cells also possess negative regulatory mechanisms. We’ve seen that transcriptional activation is associated with changes in the state and/or position of nucleosomes in a particular region of chromatin. The state of acetylation of chromatin is a dynamic property; just as there are enzymes (HATs) to add acetyl groups, there are also enzymes to remove them. Removal of acetyl groups is accomplished by histone deacetylases (HDACs). Whereas HATs are associated with transcriptional activation, HDACs are associated with transcriptional repression. HDACs are present as subunits of larger complexes described as corepressors. Corepressors are similar to coactivators, except that they are recruited to specific genetic loci by transcriptional factors (repressors) that cause the targeted gene to be silenced rather than activated (Figure 12.53). The progression of several types of cancer may depend on the tumor cells being able to repress the activity of certain genes. A number of anti-cancer drugs (e.g., Zolinza) are currently being tested that act by inhibiting HDAC enzymes. Recent studies indicate that the removal of acetyl groups from histone tails is accompanied by another histone modification: methylation of the lysine residue at position #9 of histone H3 molecules. You might recall that this modification, H3K9me, was discussed at some length on page 501 as a key event in the formation of heterochromatin. It now appears

[Figure 12.53 labels: Active chromatin; Acetylated histones; Binding site for repressor (silencer); Transcriptional repressor; Corepressor (e.g., SMRT/N-CoR or CoREST); HDAC (e.g., Sin3 complex); Deacetylation; Deacetylated histones; Histone methyltransferase (e.g., SUV39H1); H3-K9 methylation; Methyl group; Inactive chromatin]

Figure 12.53 A model for transcriptional repression. Histone tails in the promoter regions of active chromatin are usually heavily acetylated. When a transcriptional repressor binds to its DNA binding site, it recruits a corepressor complex (e.g., SMRT/N-CoR or CoREST) and an associated HDAC activity. The HDAC removes acetyl groups from the histone tails. A separate protein containing histone methyltransferase activity adds methyl groups to the K9 residue of H3 histone tails. Together, the loss of acetyl groups and addition of methyl groups lead to chromatin inactivation and gene silencing.


DNA Methylation

One of the key factors in silencing a region of the genome involves a phenomenon known as DNA methylation. Examination of the DNA of mammals and other vertebrates indicates that as many as 1 out of 100 nucleotides bears an added methyl group, which is always attached to carbon 5 of a cytosine. Methyl groups are added to the DNA by a family of enzymes called DNA methyltransferases encoded in humans by DNMT genes. This simple chemical modification is thought to serve as an epigenetic mark or “tag” that allows certain regions of the DNA to be identified and utilized differently from other regions. In mammals, the methylcytosine residues are part of a 5′-CpG-3′ dinucleotide within a symmetrical sequence, such as

5′-CCGG-3′    5′-GCGC-3′    5′-ACGT-3′
3′-GGCC-5′    3′-CGCG-5′    3′-TGCA-5′

in which the methyl groups (marked by red dots in the printed figure) are attached to the cytosine of each CpG.⁸ As a true epigenetic mark, the pattern of DNA methylation must be maintained through repeated cell divisions. This is accomplished by an enzyme, Dnmt1, that travels with the replication fork and methylates the daughter DNA strands by copying the methylation pattern of the parental strands. The majority of methylcytosine residues in mammalian DNA are located within noncoding, repeated sequences, primarily transposable elements (page 410). Methylation is thought to maintain these elements in an inactive state. Organisms that harbor mutant DNMTs tend to exhibit a marked increase in transposition activity, which can be detrimental to the health of the organism. As discussed on page 460, piRNAs serve as mediators of transposable element suppression in germ cells. During the formation of male gametes (i.e., during spermatogenesis), piRNAs act by guiding the DNA-methylation machinery to sites where transposable elements reside within the genome.
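The symmetry of CpG-containing sequences and the copying of a methylation pattern onto newly made strands can be illustrated with a short Python sketch. It is a deliberately simplified model: the helper names, the toy sequence, and the rule "a new strand is methylated at exactly the CpG sites that are methylated on the parental strand it pairs with" are assumptions chosen to mirror the description of Dnmt1 above, not a description of the enzyme's actual mechanism.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_symmetric(seq):
    """True if both strands of the duplex read the same 5' to 3'."""
    return seq == reverse_complement(seq)


def cpg_sites(seq):
    """Indices of the C of every CpG dinucleotide on the given strand."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate(duplex, maintenance=True):
    """Replicate a duplex described by the methylated CpG sites on each strand.

    duplex: (top_sites, bottom_sites), each a set of CpG site indices.
    Each daughter keeps one parental strand and gains one new strand; the new
    strand copies the parental pattern only if maintenance methylation is on.
    """
    top, bottom = duplex
    new_partner_of_top = set(top) if maintenance else set()
    new_partner_of_bottom = set(bottom) if maintenance else set()
    return (top, new_partner_of_top), (new_partner_of_bottom, bottom)


print([is_symmetric(s) for s in ("CCGG", "GCGC", "ACGT")])
print(cpg_sites("ACGTTCCGGA"))
print(replicate(({1}, {1})))
print(replicate(({1}, {1}), maintenance=False))
```

In the last call, each daughter duplex is methylated on only one strand (hemimethylated), which shows why a maintenance activity is needed if the mark is to persist through repeated cell divisions.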

⁸ An exception is found in embryonic stem cells, where approximately one-fourth of methylations occur in a non-CpG context, including CpA and CpT. At some sites, methylcytosine can be enzymatically converted to hydroxymethylcytosine, which may represent either an intermediate in the DNA demethylation process or an alternate epigenetic mark.

[Figure 12.54 labels: Demethylation; De novo methylation; Maintenance of methylation; Gamete; Zygote; Cleavage; Blastocyst; Implantation; Gastrulation; Embryo; Adult somatic cells; Primordial germ cells; Gametogenesis; axes: Methylation level versus Developmental time]

Figure 12.54 Changes in DNA methylation levels during mammalian development. The DNA of the fertilized egg (zygote) is substantially methylated. During cleavage, the genome undergoes global demethylation. Interestingly, DNA inherited from one’s father undergoes demethylation at an earlier stage and by a different mechanism than DNA inherited from one’s mother. After implantation, the DNA is subjected to new (de novo) methylation, which is maintained in the somatic cells at a high level throughout the remainder of development and adulthood. In contrast, the DNA of primordial germ cells, which give rise to the gametes in the adult, is subsequently demethylated. The DNA of the germ cells then becomes remethylated at later stages of gamete formation. (BASED ON A FIGURE FROM R. JAENISCH, TRENDS GENETICS 13-325, 1997; COPYRIGHT 1997. TRENDS IN GENETICS BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/ TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


DNA Methylation and Transcriptional Repression

In addition to the general suppression of transposable elements, DNA methylation has long been implicated in the repression of transcription of specific genes. In recent years, techniques have been developed to determine the specific cytosine residues that are methylated in any given population of cells across the entire genome. These studies confirm that (1) the promoter regions of inactive genes tend to be more heavily methylated than the promoter regions of active genes and (2) DNA methylation patterns vary from one cell type to another, reflecting the differential activity of genes among various tissues. Most evidence suggests that methylation of a gene’s promoter serves more to maintain that gene in an inactive state than as a mechanism for initial inactivation. As an example, inactivation of the genes on one of the X chromosomes of female mammals (page 498) occurs prior to a wave of methylation of gene promoters that is thought to convert the DNA into a more permanently repressed condition. How does a given cell determine which promoters should become methylated, and how does methylation of these promoters enforce the transcriptionally repressed state of that gene? DNA methylation has been closely linked with another repressive epigenetic mark: histone methylation. It may be that gene inactivation begins with the establishment of a transcriptionally repressive pattern of histone modifications in the core histones of promoter regions and that these modified histone tails then recruit the DNA methylation machinery to those nucleosomes. Once the DNA in these regions is methylated, the methylated cytosine residues can serve as binding sites to recruit additional histone modifying enzymes that further repress and compact the chromatin of that promoter (as in Figure 12.53).

Although DNA methylation is a relatively stable epigenetic mark, the transmission of these marks from a parental cell to its daughters is subject to regulation. The remarkable shifts in DNA-methylation levels that occur during the life of a mammal are depicted in Figure 12.54. The first major


that this same modification is also involved in the more dynamic types of transcriptional repression that occur within euchromatic regions of the genome. Figure 12.53 suggests one of numerous possible models of transcriptional repression that incorporates several aspects of chromatin modification.


change in the level of methylation occurs between fertilization and the first few divisions of the zygote, when the DNA loses the methylation “tags” that were inherited from the previous generation. Then, at about the time the embryo implants into the uterus, a wave of new (or de novo) methylation spreads through the cells, establishing a new pattern of methylation throughout the DNA. We don’t know the signals that determine whether a given gene in a particular cell is targeted for methylation or spared at this time. It is evident, however, that abnormal DNA methylation patterns are often associated with disease. For example, the development of tumors often depends on aberrant methylation and subsequent silencing of genes whose expression would normally suppress tumor growth.

DNA methylation is not a universal mechanism for inactivating eukaryotic genes. DNA methylation has not, for example, been found in yeast or nematodes. Plant DNA, in contrast, is often heavily methylated, and studies on cultured plant cells indicate that, as in animals, DNA methylation is associated with gene inactivation. In one experiment, plants treated with compounds that interfere with DNA methylation produced a greatly increased number of leaves and flower stalks. Moreover, flowers that developed on these stalks had a markedly altered morphology. One of the most dramatic examples of the role of DNA methylation in silencing gene expression occurs as part of an epigenetic phenomenon known as genomic imprinting, which is unique to mammals.

Genomic Imprinting

It had been assumed until the mid-1980s that the set of chromosomes inherited from a male parent was functionally equivalent to the corresponding set of chromosomes inherited from the female parent. But, as with many other long-standing assumptions, this proved not to be the case.
Instead, certain genes are either active or inactive during early mammalian development depending solely on whether they were brought into the zygote by the sperm or the egg. For example, the gene that encodes a fetal growth factor called IGF2 is only active on the chromosome transmitted from the male parent. In contrast, the gene that encodes a specific potassium channel (KVLQT1) is only active on the chromosome transmitted from the female parent. Genes of this type are said to be imprinted according to their parental origin. Imprinting can be considered an epigenetic phenomenon (page 509), because the differences between alleles are inherited from one’s parents but are not based on differences in DNA sequence. It is estimated, based largely on the study of mutant mice, that the mammalian genome contains at least 80 imprinted genes located primarily in several distinct chromosomal clusters. Genes are thought to become imprinted as the result of selective DNA methylation of certain regions that control the expression of either the male or female alleles. As a result, the maternal and paternal versions of imprinted genes differ consistently in their degree of methylation. Furthermore, mice that lack a key DNA methyltransferase (Dnmt1) are unable to maintain the imprinted state of the genes they inherit. The methylation state of imprinted genes is not affected by the waves of demethylation and remethylation that sweep

through the early embryo (Figure 12.54). Consequently, the same alleles that are inactive due to imprinting in the fertilized egg will be inactive in the cells of the fetus and most adult tissues. The major exception occurs in the germ cells, where the imprints inherited from the parents are erased during early development and then reestablished when that individual begins to produce his or her own gametes. Some mechanism must exist whereby specific genes (e.g., KVLQT1) are selected for inactivation during sperm formation, whereas other genes (e.g., IGF2) are selected for inactivation during formation of the egg. Each of the imprinted gene clusters produces at least one noncoding RNA. These RNAs play a key role in directing the silencing of nearby genes in several of the clusters. Disturbances in imprinting patterns have been implicated in a number of rare human genetic disorders, particularly those involving a cluster of imprinted genes residing on chromosome 15. Prader-Willi syndrome (PWS) is an inherited neurological disorder characterized by mental retardation, obesity, and underdevelopment of the gonads. The disorder often occurs when chromosome 15 inherited from the father carries a deletion in a small region containing the imprinted genes. Because the paternal chromosome carries a deletion of one or more genes and the maternal chromosome carries the inactive, imprinted version of the homologous region, the individual lacks a functional copy of the gene(s). Although genes are usually imprinted for the life of an individual, cases are known where imprinting can be lost. In fact, loss of imprinting of the IGF2 gene occurs in about 10 percent of the population, leading to increased production of the encoded growth factor. Individuals with this epigenetic alteration are at increased risk of developing colorectal cancer. 
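The reasoning about parent-of-origin expression, deletion, and loss of imprinting can be made concrete with a small Python sketch. The function name and its parameters are hypothetical, and the model is a simplification that counts expressed copies of a single imprinted gene; it assumes only what the text states about which parental copy of IGF2 and KVLQT1 is active.

```python
# Parental copy that is expressed for each imprinted gene named in the text.
EXPRESSED_FROM = {"IGF2": "paternal", "KVLQT1": "maternal"}


def functional_copies(expressed_from, deleted=frozenset(), imprint_lost=False):
    """Count expressed copies of an imprinted gene in a somatic cell.

    expressed_from: "paternal" or "maternal", the copy that is normally active.
    deleted: parental chromosomes on which the gene has been deleted.
    imprint_lost: if True, the normally silent copy is expressed as well.
    """
    active = {"paternal", "maternal"} if imprint_lost else {expressed_from}
    return len(active - set(deleted))


# Normal situation: one expressed copy of each gene.
print(functional_copies(EXPRESSED_FROM["IGF2"]))                      # 1
# Deletion on the chromosome carrying the only active copy: none left.
print(functional_copies("paternal", deleted={"paternal"}))            # 0
# Loss of imprinting: both copies are expressed.
print(functional_copies(EXPRESSED_FROM["IGF2"], imprint_lost=True))   # 2
```

The second call corresponds to the logic given for the chromosome 15 deletion (the only active copy is lost while the other copy is silenced by imprinting), and the third to the increased IGF2 output described for loss of imprinting.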
One case has been uncovered of a woman who, because of a presumed deficiency in a DNA methylating enzyme, produced oocytes totally lacking imprinted genes. When fertilized, these oocytes failed to develop past implantation, demonstrating the essential nature of this epigenetic contribution. What possible role could genomic imprinting play in the development of an embryo? Although there are several different thoughts on this question, there is no definitive answer. According to one researcher, genomic imprinting is a “phenomenon in search of a reason,” which is where we will leave the matter.

Long Noncoding RNAs (lncRNAs) as Transcriptional Repressors

As discussed on page 461, a large fraction of the mammalian genome is transcribed into RNAs, including thousands of species that are large enough to be described as long noncoding RNAs, or lncRNAs. The functions of a handful of lncRNAs have been well studied, including several species involved in genomic imprinting or in X-chromosome inactivation. These studies suggest that lncRNAs serve as sequence-specific molecules that can guide protein complexes to specific sites in the chromatin. Like Xist (page 499), most lncRNAs appear to play a role in orchestrating transcriptional repression, although some lncRNAs may instead function in transcriptional activation at some loci. An example of lncRNA-mediated gene repression is seen in the expression of human HOX genes, whose encoded

[Figure 12.55 labels, steps 1–3: HOTAIR (lncRNA), 5′ and 3′ ends; RNA polymerase; HOXC locus; HOXD locus; HOXD locus (repressed); CoREST corepressor; LSD1 (H3K4 demethylase); PRC2 (H3K27 histone methyltransferase); Methyl groups on K4 of H3; Methyl group removed from H3K4; Methyl group added to H3K27; H3K27 methylated]

Figure 12.55 An lncRNA acting as a mediator of transcriptional repression. In step 1, the lncRNA HOTAIR is being transcribed from a portion of the HOXC locus located on human chromosome 12. In step 2, the HOTAIR RNA has adopted a secondary structure that allows it to interact with specific protein complexes that will act on the HOXD locus located on human chromosome 2. The 3 end of the lncRNA binds specifically to the CoREST corepressor complex, which is associated with the enzyme LSD1, which removes methyl groups

from H3K4 residues of the nucleosomes, resulting in the removal of a transcriptionally active epigenetic mark. Meanwhile, the 5 end of HOTAIR is bound specifically to the PRC2 complex (a Polycomb Group protein complex) that has an enzyme subunit that adds methyl groups to H3K27 residues, which is a transcriptionally repressive epigenetic mark. Because of the demethylation of H3K4 and the methylation of H3K27, the HOXD locus is transcriptionally repressed and has adopted a compact conformation (step 3).

proteins play a key role in determining the anterior-posterior axis of the early embryo. The 5′ end of the lncRNA HOTAIR (which is transcribed from the HOXC locus in the human genome) binds the PRC2 complex, while the 3′ end of this lncRNA binds the CoREST complex. PRC2 contains a histone methyltransferase that is specific for the K27 residue on the tails of H3 core histones. Methylation of H3K27 is a repressive modification that maintains the transcriptionally inactive state of the chromatin of those genes to which it has been added. CoREST is a corepressor complex that was shown in Figure 12.53. In that situation, CoREST acted as a corepressor through its association with a histone deacetylase. In the present case shown in Figure 12.55, CoREST is associated with a histone demethylase that removes methyl groups from K4 residues of histone H3. As indicated in Figure 12.49, methylated H3K4 is a mark of transcriptionally active genes, and, consequently, removal of the added methyl groups leads to repression of transcription. HOTAIR guides these two repressive protein complexes to another locus (HOXD) situated on a different chromosome. While they are tethered to the HOXD target locus, PRC2 and CoREST modify the adjacent chromatin and inhibit its transcription. This is not an isolated instance: genome-wide ChIP-chip analysis (page 524) indicates that more than 700 genes in human fibroblasts are occupied by both PRC2 and CoREST, with HOTAIR providing the link between the two complexes. It is proposed that lncRNAs can act as scaffolds to hold protein complexes in close association to specific target sites in the genome, where they can carry out their chromatin-modifying functions. When the expression of large numbers of lncRNAs is selectively blocked, the gene-expression patterns of the affected cells can be dramatically altered, suggesting that these RNA molecules can play a widespread role in the regulation of gene expression. These gene regulators can have important biological roles. For example, when transcription of certain lncRNAs is blocked in embryonic stem cells, the cells lose their pluripotent state (page 21) and express genes characteristic of specific lineages that would normally be repressed.
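The scaffold idea described for HOTAIR can be pictured with a deliberately simplified sketch. It assumes that a locus can be reduced to a set of histone marks and that transcription requires methylated H3K4 without methylated H3K27; the class and function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Locus:
    """A locus reduced to a set of histone marks (a gross simplification)."""
    name: str
    marks: set = field(default_factory=lambda: {"H3K4me"})

    def is_active(self):
        # Assumption: active means H3K4 is methylated and H3K27 is not.
        return "H3K4me" in self.marks and "H3K27me" not in self.marks


def lsd1_demethylate(locus):
    """CoREST-associated LSD1 removes the activating H3K4 methyl mark."""
    locus.marks.discard("H3K4me")


def prc2_methylate(locus):
    """PRC2 adds the repressive H3K27 methyl mark."""
    locus.marks.add("H3K27me")


def hotair_scaffold(locus):
    """HOTAIR tethers both complexes to one target, so both marks change."""
    lsd1_demethylate(locus)
    prc2_methylate(locus)


hoxd = Locus("HOXD")
print(hoxd.is_active())   # active before the complexes are recruited
hotair_scaffold(hoxd)
print(hoxd.is_active())   # repressed afterward
```

The point of the sketch is only that a single RNA brings two independent enzymatic activities to the same site; it says nothing about kinetics or about how the target locus is recognized.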

REVIEW
1. How are the functions of a bacterial lac repressor and a mammalian glucocorticoid receptor similar? How are they different?
2. What types of regulatory sequences are found in the regulatory regions of the DNA upstream from a gene, such as that which encodes for PEPCK? What is the role of these various sequences in controlling the expression of the nearby gene(s)?
3. What is meant by the term epigenetic? How is it that such diverse phenomena as histone methylation, DNA methylation, and centromere determination can all be described as epigenetic?
4. What are some of the properties that tend to be found in several groups of transcription factors?
5. What is the difference between a transcriptional activator and a transcriptional repressor? between a coactivator and a corepressor? between an HAT and an HDAC?
6. How does methylation of DNA affect gene expression? How is it related to histone acetylation or histone methylation? What is meant by genomic imprinting?

12.5 | RNA Processing Control

Once an RNA is transcribed, it typically must undergo a number of processing events before it can function. This is particularly true for mRNAs, which must have a 5′ cap structure added (see Figure 11.21), be correctly spliced, and have

Figure 12.56 Alternative splicing of the fibronectin gene. The fibronectin gene consists of a number of exons shown in the top drawing (the introns shown in black are not drawn to scale). Two of these exons encode portions of the polypeptide called EIIIA and EIIIB, which are included in the protein produced by fibroblasts, but which are excluded from the protein produced in the liver. The difference is due to alternative splicing; those portions of the pre-mRNA that encode these two exons are excised from the transcript in liver cells. The sites of the missing exons are indicated by the arrows in the liver mRNA.

an intact poly(A) tail at the 3′ end. Each of these events must be properly regulated to enable an mRNA to exit the nucleus and serve as a template for protein assembly. The genes of complex plants and animals contain numerous introns and exons, and the introns must be precisely removed to allow transport through the nuclear pore. However, the pattern of intron removal can be subject to dramatic regulation, allowing for multiple protein products from the same gene. The particular splicing pathway that is followed may depend on the particular stage of development, or the particular cell type or

[Figure 12.57 labels: exon 4, 12 alternatives; exon 6, 48 alternatives]

tissue being considered. In the simplest case, a specific exon can either be retained or spliced out of the transcript. An example of this type of alternative splicing occurs during synthesis of fibronectin, a protein found in both blood plasma and the extracellular matrix (Figure 12.56). Fibronectin produced by fibroblasts and retained in the matrix is encoded by an mRNA that contains two extra exons compared to the version of the protein produced by liver cells and secreted into the blood. The extra peptides are encoded by portions of the pre-mRNA that are retained during processing in the fibroblast but are removed during processing in the liver cell. A more complicated pattern is observed in the Dscam gene, which encodes a family of cell-adhesion molecules that are expressed on the cell surface and function in axon guidance during early neuronal development in fruit flies. A homologue has been implicated in features of Down syndrome in humans. In fruit flies, this gene consists of 24 exons but includes multiple alternative versions of exons 4, 6, 9, and 17 (Figure 12.57). Remarkably, as a result of the combinatorial patterns that can be generated by selecting one variant from each cluster of variable exons, a single Dscam gene can encode up to 38,016 different cell-adhesion molecules (12 × 48 × 33 × 2 alternatives for exons 4, 6, 9, and 17, respectively). The fruit fly genome contains only about 14,000 genes, illustrating how alternative splicing can enable a relatively small genome to encode a much larger proteome. The regulation of alternative splicing that allows one neuron to select a single Dscam isoform and not another is complex, and understanding it is an important goal of ongoing research, as it is for all alternative splicing events. The mechanism by which a particular exon is included or excluded depends primarily on whether or not specific 3′ and 5′ splice sites are selected by the splicing machinery as sites to be cleaved (page 450). Many factors can influence splice site selection. 
Some splice sites are described as “weak,” which indicates that they can be bypassed by the splicing machinery under certain conditions. The

[Figure 12.57 labels: exon 9, 33 alternatives; exon 17, 2 alternatives]

Figure 12.57 A more complex example of alternative splicing. Most eukaryotic genes are subject to alternative splicing. The Drosophila Dscam gene illustrates the diversity of proteins that can be generated by alternative splicing of transcripts from a single gene. The top line shows the organization of the pre-mRNA and genomic DNA. The gene contains 24 exons but a given primary transcript only contains one of the multiple possibilities from each of the exon 4, 6, 9, and 17 clusters. The middle line shows the mature mRNA with the selected exon from each of these four clusters depicted in a different color. The bottom line shows the domain structure of the encoded protein. Exons 4 and 6 encode portions of two of the Ig domains of the protein, exon 9 encodes a different Ig domain, and exon 17 encodes the protein’s transmembrane domain. Combining all the possibilities, the Dscam gene can encode 38,016 possible mRNAs. For comparison, the entire fruit fly genome contains only about 14,000 genes. (D. L. BLACK, CELL 103:368; 2000, FIG. 1; CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)
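The number of possible Dscam mRNAs follows from simple multiplication, because one alternative is chosen independently from each of the four variable exon clusters. A minimal sketch, using the cluster sizes given in the Figure 12.57 labels (12, 48, 33, and 2); the variable and function names are illustrative only:

```python
from math import prod

# Number of mutually exclusive alternatives in each variable exon cluster.
DSCAM_ALTERNATIVES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(alternatives):
    """One alternative is selected per cluster, so the counts multiply."""
    return prod(alternatives.values())


total = count_isoforms(DSCAM_ALTERNATIVES)
print(total)                      # 38016
print(round(total / 14_000, 1))   # isoforms per gene in a ~14,000-gene genome
```

The second line of output puts the figure in perspective: this one gene can specify more than twice as many distinct mRNAs as there are genes in the fly genome.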

Figure 12.58 Mechanisms of alternative splicing. (a) Changes in the sequence of a 5′ splice site can affect pairing with U1 snRNA. This can affect the kinetics of splicing by allowing those sites with better matches to U1 to recruit the spliceosome to one 5′ splice site over another. In this figure, two potential 5′ splice sites are present, as indicated by the two GU dinucleotides. The black line indicates the segments of the transcript that will be ligated after excision of the intervening section. In the top drawing, the splicing machinery has recognized the second of the two potential 5′ splice sites. In the middle drawing, a change in sequence (indicated by the black rectangle) has occurred in the region of the second potential 5′ splice site, and the splicing machinery now recognizes the first splice site. In the bottom drawing, a second change of sequence has been introduced in the region of the first potential 5′ splice site, causing the splicing machinery to ignore this site and use the other site as the 5′ splice site. (b) Different-strength splice sites can also be repressed or activated, depending on nearby binding of proteins that tend to either activate splice sites (SR proteins) or repress splice sites (hnRNP proteins). The SR proteins bind to specific sites within either exons or introns called exon and intron splicing enhancers (ESEs and ISEs) shown in light blue. The hnRNP proteins bind to other sites in exons or introns called exon and intron splicing silencers (ESSs and ISSs) shown in light red. Binding of these proteins can regulate splice site selection by determining whether or not splicing components such as U2AF, U1 snRNP, and U2 snRNP bind to a particular site on the pre-mRNA. (B. HARTMANN AND J. VALCARCEL, CURR OPIN CELL BIOL 21:377–386, 2009, FIG. 2. CURRENT OPINION IN CELL BIOLOGY BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

recognition and use of weak splice sites are governed by sequences in the RNA, including exonic splicing enhancers (page 453) that are located within the exons whose inclusion is regulated. Exonic splicing enhancers serve as binding sites for specific regulatory proteins. If a particular regulatory protein is present and active in a given cell, that protein can bind to the splicing enhancer and recruit the necessary splicing factors to a nearby weak 3′ or 5′ splice site. Use of these splice sites results in the inclusion of the exon into the mRNA. A model of how this may work is depicted in Figure 12.58. If the regulatory protein is not present and active in the cell, the neighboring splice sites are not recognized and the exon is excised along with the flanking introns.
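The decision logic described above can be written as a small sketch. It assumes an all-or-nothing rule (real splice-site choice is graded), and the exon names and the regulator "RegX" are hypothetical placeholders, not molecules named in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Exon:
    name: str
    weak_sites: bool = False                 # True if the flanking splice sites are "weak"
    enhancer_protein: Optional[str] = None   # hypothetical regulator that binds the exon's enhancer


def spliced_exons(exons, active_proteins):
    """Return the names of the exons retained in the mature mRNA."""
    kept = []
    for exon in exons:
        if not exon.weak_sites:
            kept.append(exon.name)   # strong sites are recognized unaided
        elif exon.enhancer_protein in active_proteins:
            kept.append(exon.name)   # bound regulator recruits splicing factors
        # otherwise the exon is excised along with its flanking introns
    return kept


PRE_MRNA = [
    Exon("exon 1"),
    Exon("exon 2", weak_sites=True, enhancer_protein="RegX"),
    Exon("exon 3"),
]

print(spliced_exons(PRE_MRNA, active_proteins={"RegX"}))  # regulator present
print(spliced_exons(PRE_MRNA, active_proteins=set()))     # regulator absent
```

Running the same pre-mRNA through two cell states yields two different mRNAs, which is the essence of cell-type-specific alternative splicing.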

REVIEW
1. How is it that alternative splicing can effectively increase the number of genes in the genome?
2. Describe an example of alternative splicing. What is the value to a cell of this type of control? How might a cell regulate the sites on the pre-mRNA that are chosen for splicing?
3. What is RNA editing, and how can it increase the number of proteins that can be formed from a single pre-mRNA transcript?


RNA Editing

Another way in which gene expression can be regulated at the posttranscriptional level is by RNA editing, in which specific nucleotides are converted to other nucleotides after the RNA has been transcribed. RNA editing can create new splice sites, generate stop codons, or lead to amino acid substitutions. Although not nearly as widespread as alternative splicing, RNA editing is particularly important in the nervous system, where a significant number of messages appear to have one or more adenines (A) converted to inosines (I). This modification involves the enzymatic removal of an amino group from the nucleotide. Inosine is subsequently read as a G by the translational machinery. The glutamate receptor, which mediates excitatory synaptic transmission in the brain (page 170), is a product of RNA editing. In this case, an A-to-I modification generates a glutamate receptor whose internal channel is impermeable to Ca2+ ions. Genetically engineered mice that are unable to carry out this specific RNA-editing step develop severe epileptic seizures and die within weeks after birth. Another important example of RNA editing affects the cholesterol-carrying protein apolipoprotein B. The LDL complexes discussed on page 313 are produced in the liver and contain the protein apolipoprotein B-100, which is translated from a full-length mRNA approximately 14,000 nucleotides long. In the intestine, the cytidine at nucleotide residue 6666 in the RNA is converted enzymatically to a uridine, which generates a stop codon (UAA) that terminates translation. The shortened version of the protein, apolipoprotein B-48, is produced only in cells of the small intestine where it plays an essential role in the absorption of fats.
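Both kinds of editing can be illustrated at the level of a single codon. The sketch below is a toy model: the codon table holds only the four codons used here, CAA is the codon implied by a C-to-U change that yields UAA, and the CAG codon chosen for the A-to-I case is an assumed example rather than one given in the text.

```python
# Partial codon table: only the codons needed for this sketch.
CODON_TABLE = {"CAA": "Gln", "UAA": "Stop", "CAG": "Gln", "CGG": "Arg"}


def edit(codon, position, old, new):
    """Convert the base at a 0-based position, checking the expected base."""
    if codon[position] != old:
        raise ValueError(f"expected {old} at position {position} of {codon}")
    return codon[:position] + new + codon[position + 1:]


def read_codon(codon):
    """Inosine (I) is read as G by the translational machinery."""
    return CODON_TABLE[codon.replace("I", "G")]


# C-to-U editing: a glutamine codon becomes a stop codon.
apob_edited = edit("CAA", 0, "C", "U")
print(apob_edited, read_codon(apob_edited))

# A-to-I editing: the edited codon is decoded as if it contained G.
receptor_edited = edit("CAG", 1, "A", "I")
print(receptor_edited, read_codon(receptor_edited))
```

A single-base change is enough either to truncate a protein or to substitute one amino acid for another, which is why editing can yield distinct proteins from one transcript.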


12.6 | Translational Control

Translational control encompasses a wide variety of regulatory mechanisms affecting the translation of mRNAs previously transported from the nucleus to the cytoplasm. Subjects considered under this general regulatory umbrella include (1) localization of mRNAs to certain sites within a cell; (2) whether or not an mRNA is translated and, if so, how often; and (3) the half-life of the mRNA, a property that determines how long the message is translated. Translational control mechanisms generally operate through interactions between specific mRNAs and various proteins and microRNAs present within the cytoplasm. It was noted on page 444 that mRNAs contain noncoding regions, called untranslated regions (UTRs), at both their 5′ and 3′ ends. The 5′ UTR extends from the 5′ cap to the AUG initiation codon, whereas the 3′ UTR extends from the termination codon to the end of the RNA transcript (see Figure 11.21). For many years the untranslated regions of the message were largely ignored, but it has become evident that the UTRs contain nucleotide sequences used by the cell to mediate translational control. In eukaryotic cells, a number of important biological processes depend on translation of stored mRNAs that are not translated immediately upon entry to the cytoplasm. As an example, unfertilized eggs store many mRNAs that must remain inactive until fertilization and subsequent development. The initiation of translation of these mRNAs during early development involves at least two distinct events: removal of bound inhibitory proteins and an increase in the length of the poly(A) tails by action of an enzyme residing in the egg cytoplasm. These events are illustrated in the model of translational activation in Xenopus embryos in Figure 12.59 and serve to emphasize the fact that the 5′ and 3′ ends of mRNAs often communicate by protein-protein interaction to regulate translation.

Initiation of Translation

Prokaryotic cells have polycistronic mRNAs that encode numerous polypeptides, whereas eukaryotic mRNAs are predominantly monocistronic, encoding only a single polypeptide. In both cases, translation starts at an AUG codon, but how the ribosome finds this start codon is entirely different and provides a means to differentially regulate overall protein production. We will start with eukaryotic translation initiation since it is fairly simple: the first AUG downstream of the 7-methylguanosine cap is almost always the start codon. The small subunit of the ribosome assembles at the cap structure and scans down until the first AUG is found, after which the large subunit of the ribosome joins and translation begins (see Figure 11.47). There are a few examples where an internal ribosome entry site (IRES) allows the ribosome to start at an AUG other than the first one downstream of the cap, but the overwhelming majority of mRNAs are structured such that translation begins at the first AUG downstream of the cap. A number of mechanisms have been discovered that regulate the rate of translation of mRNAs in response to changing cellular requirements. Some of these mechanisms can be considered to act globally because they affect translation of all messages. When a human cell is subjected to certain stressful stimuli, a protein kinase is activated that phosphorylates the initiation factor eIF2, which blocks further protein synthesis. As discussed on page 469, eIF2-GTP delivers the initiator tRNA to the small ribosomal subunit, after which it is converted to eIF2-GDP and released. The phosphorylated version of eIF2 cannot exchange its GDP for GTP, which is required for eIF2 to become engaged in another round of initiation of translation. It is interesting to note that four different protein kinases have been identified that phosphorylate the same Ser residue of the eIF2α subunit to trigger translational inhibition. 
Each of these kinases becomes activated after a different type of cellular stress, including heat shock, viral infection, the presence of unfolded proteins, or amino acid starvation. Thus at least four different stress pathways converge to induce the same response.

Other mechanisms influence the rate of translation of specific mRNAs through the action of proteins that recognize specific elements in the UTRs of those mRNAs. One of the best studied examples involves the mRNA that encodes the protein ferritin. Ferritin sequesters iron atoms in the cytoplasm of cells, thereby protecting the cells from the toxic effects of excess free metal. The translation of ferritin mRNAs is regulated by a specific repressor, called the iron regulatory protein (IRP), whose activity depends on the concentration of unbound iron in the cell. At low iron concentrations, IRP binds to a specific sequence in the 5′ UTR of the mRNA called the iron-response element (IRE) (Figure 12.60). The bound IRP physically interferes with the binding of a ribosome to the 5′ end of the mRNA, thereby inhibiting the initiation of translation. At high iron concentrations, the IRP is modified such that it loses its affinity for the IRE. Dissociation of the IRP from the ferritin mRNA gives the translational machinery access to the mRNA, and the encoded protein is synthesized.

For prokaryotes, the process of translation initiation is more complex because there are multiple start codons in a polycistronic message and ribosomes must find the correct start sites. From a regulation standpoint, one would predict that all the open reading frames in a polycistronic mRNA should be translated at equal levels. However, this is not the case. The mechanism used to identify bona fide start codons is the presence of a Shine-Dalgarno sequence just upstream of start codons. As discussed on page 468, this sequence is complementary to a region in 16S rRNA within the small subunit of the ribosome. Base pairing between the Shine-Dalgarno sequence and 16S rRNA places the small subunit just upstream of a start codon, allowing entry of the large subunit and initiation of translation. For some mRNAs, the Shine-Dalgarno sequence is perfectly complementary to 16S rRNA, and therefore initiation is highly efficient. For other mRNAs, the Shine-Dalgarno sequence may not be exactly complementary to 16S rRNA, so initiation of translation will be less efficient. In this way, the amounts of protein produced from each open reading frame within a polycistronic mRNA can be regulated. For example, perhaps enzyme A in the trp operon is needed at much higher levels than the other four. A perfect Shine-Dalgarno sequence upstream of the start codon for enzyme A and imperfect pairing for the other four would allow more efficient initiation for this open reading frame compared to the others.

Figure 12.59 A model for the mechanism of translational activation of mRNAs following fertilization of a Xenopus egg. Maternally contributed messenger RNAs in the egg are maintained in the cytoplasm in an inactive state by a combination of their short poly(A) tails and a bound inhibitory protein called Maskin. Maskin is tethered at one surface to CPEB, a protein that binds to sequences in the 3′ UTR of specific mRNAs, and at another surface to the cap-binding protein eIF4E. Following fertilization, CPEB is phosphorylated, which displaces Maskin. The phosphorylated version of CPEB recruits another protein CPSF, which recruits poly(A) polymerase (PAP), an enzyme that adds adenosine residues to the poly(A) tail. The elongated poly(A) tail serves as a binding site for PABP molecules, which help to recruit eIF4G, an initiation factor required for translation. As a result of these changes, the mRNA is actively translated. (FROM R. D. MENDEZ AND J. D. RICHTER, NATURE REVIEWS MOL. CELL BIOL. 2:514, 2001, COPYRIGHT 2001, NATURE REVIEWS MOLECULAR CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Figure 12.60 5′ UTR control of ferritin mRNA translation. When iron concentrations are low, an iron-binding repressor protein, called the iron regulatory protein (IRP), binds to a specific sequence in the 5′ UTR of the ferritin mRNA, called the iron-response element (IRE), which is folded into a hairpin loop. When iron becomes available, it binds to the IRP, changing its conformation and causing it to dissociate from the IRE, allowing translation of the mRNA to form ferritin.

Cytoplasmic Localization of mRNAs

In eukaryotic cells, the localization of specific mRNAs to specific cytoplasmic regions is a widely utilized mechanism by which cells establish functionally distinct cytoplasmic domains. Recent innovations in live-cell imaging techniques are allowing researchers to follow the movements of specific mRNAs out of the nucleus and through the cytoplasm. We will briefly consider the fruit fly, whose egg, larval, and adult stages are illustrated in Figure 12.61a. The development of the anterior–posterior (head–abdomen) axis of a larval fly and subsequent adult is foreshadowed by localization of specific mRNAs along this same axis in the oocyte. For example, mRNAs transcribed during oogenesis from the bicoid gene become preferentially localized at the anterior end of the oocyte, while mRNAs transcribed from the oskar gene become localized at the opposite end (Figure 12.61b,c). The mRNAs are subsequently translated at the site of localization where the newly synthesized protein accumulates. The protein encoded by bicoid mRNA plays a critical role in the development of the head and thorax, whereas the protein encoded by oskar mRNA is required for the formation of germ cells, which develop at the posterior end of the larva. The information that governs the cytoplasmic localization of these mRNAs resides in the 3′ UTR. This can be demonstrated using fruit flies that carry a foreign gene whose coding region is fused to a DNA sequence containing the 3′ UTR of either the bicoid or oskar mRNA. When the foreign gene is transcribed during oogenesis, the mRNA becomes localized in the site determined by the 3′ UTR. Localization of mRNAs is mediated by RNA-binding proteins that recognize localization sequences (called zip codes) in this region of the mRNA. Microtubules, and the motor proteins that use them as tracks, play a key role in transporting mRNA-containing particles to specific locations. 
The localization of oskar mRNAs in a fruit fly oocyte, for example, is disrupted by agents such as colchicine that depolymerize microtubules and by mutations that alter the activity of the kinesin I motor protein. Microfilaments, on the other hand, are thought to anchor mRNAs after they have arrived at their destination. The localization of mRNAs is not restricted to eggs and oocytes but occurs in all types of polarized cells. For instance, actin mRNAs are localized near the leading edge of a migrating fibroblast, which is the site where actin molecules are needed for locomotion


(Figure 12.61d). During the process of localization, translation of the mRNAs is specifically inhibited by associated proteins.


The Control of mRNA Stability

The longer an mRNA is present in a cell, the more times it can serve as a template for synthesis of a polypeptide. If a cell is to control gene expression, it is just as important to regulate the survival of an mRNA as it is to regulate the synthesis of that mRNA in the first place. Unlike prokaryotic mRNAs, which begin to be degraded at their 5′ end even before their 3′ end has been completed, most eukaryotic mRNAs are relatively long-lived. Even so, the half-life of eukaryotic mRNAs is quite variable. The FOS mRNA, for example, which is involved in the control of cell division, is rapidly degraded (half-life of 10 to 30 minutes). In contrast, mRNAs that encode the dominant proteins of a particular cell, such as hemoglobin in an erythrocyte precursor or ovalbumin in a cell of a hen’s oviduct, typically have half-lives of more than 24 hours. Thus, as with mRNA localization or the rate of initiation of mRNA translation, specific mRNAs can be recognized by the cell’s regulatory machinery and given differential treatment.
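To get a feel for what these half-lives mean, the following sketch compares how much of a message survives one hour after transcription stops. It assumes simple first-order (exponential) decay, which the text does not state explicitly, and it uses 20 minutes and 24 hours as representative values within the ranges quoted above; the function name is illustrative.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of an mRNA population left after elapsed_min minutes,
    assuming first-order decay with the given half-life."""
    return 0.5 ** (elapsed_min / half_life_min)


ONE_HOUR = 60

# Short-lived message (assumed 20-minute half-life, within the FOS range).
print(f"{fraction_remaining(ONE_HOUR, 20):.3f}")

# Long-lived message (assumed 24-hour half-life, e.g., a globin-like mRNA).
print(f"{fraction_remaining(ONE_HOUR, 24 * 60):.3f}")
```

Under these assumptions only about one-eighth of the short-lived message remains after an hour, whereas the long-lived message is almost untouched, so a cell can shut off expression of an unstable mRNA quickly simply by halting its transcription.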

Figure 12.61 Cytoplasmic localization of mRNAs. (a) Schematic drawings showing three stages in the life of a fruit fly: the egg, larva, and adult. Segments of the thorax and abdomen are indicated. (b) Localization of bicoid mRNA at the anterior pole of an early cleavage stage of a fly embryo by in situ hybridization. (c) Localization of oskar mRNA at the posterior pole of a comparable stage to that shown in b. Both of these localized RNAs play an important role in the development of the anterior–posterior axis of the fruit fly. (d) Localization of β-actin mRNA (red) near the leading edge of a migrating fibroblast. This is the region of the cell where actin is utilized during locomotion (see Figure 9.71). (B: COURTESY OF DANIEL ST JOHNSTON; C: COURTESY OF ANTOINE GUICHET AND ANNE EPHRUSSI, EMBL, HEIDELBERG, GERMANY; D: FROM V. M. LATHAM ET AL., COURTESY ROBERT H. SINGER, CURR. BIOL. 11:1010, 2001, FIG 4D, WITH PERMISSION FROM ELSEVIER.)

Unless protected by mechanisms such as those used in unfertilized eggs (page 536), mRNAs with short or absent poly(A) tails are rapidly degraded. This suggests that the longevity of an mRNA is related to the length of its poly(A) tail. When a typical mRNA leaves the nucleus, it contains a tail of approximately 200 adenosine residues (Figure 12.62a, step 1). As an mRNA remains in the cytoplasm, its poly(A) tail tends to be gradually reduced in length as it is nibbled away by a type of exonuclease known as a deadenylase. No dramatic effect on mRNA stability is observed until the tail becomes shortened to approximately 30 residues (step 2). Once the tail is shortened to this length, the mRNA is usually degraded rapidly by either of two pathways. In one of these pathways (shown in Figure 12.62a), degradation of the mRNA begins at its 5 end following removal of the remaining poly(A) at the 3 end of the message. The fact that the poly(A) tail at the 3 end of the message protects the cap at the 5 end of the molecule suggests that the two ends of the mRNA are held in close proximity (see Figure 11.47). Once the 3 tail is removed (step 3, Figure 12.62a), the message is decapped (step 4) and degraded from the 5 end toward the 3 end (step 5). Deadenylation, decapping, and 5 → 3 degradation occur within small transient cytoplasmic granules called P-bodies (Figure 12.62c). In addition to destroying “un-

539 1

m7Gppp

AAAAAAAAAAA200

2

m7Gppp

AAAAA30 Deadenylase

m7Gppp

3

A0

Decapping enzyme

m7Gppp

4

A0

A0

5

5'

3' exonuclease

(a)

(c)

3a

m7Gppp

4a

m7Gppp

A0

Exosome

(b)

ity they produce) can be appreciated by considering the normally short-lived FOS mRNA mentioned above. If the destabilizing sequence from the FOS gene is lost through a deletion, the half-life of FOS mRNA increases, and the cells often become malignant. Destabilizing sequences in the 3 UTR are thought to serve as binding sites both for proteins (e.g., AUF1) and microRNAs that bring about the deadenylation and subsequent destruction of the mRNA, as discussed below.

The Role of MicroRNAs in Translational Control Proteins are not the only molecules that can act as regulators of mRNA translation and stability. The formation and mechanism of action of microRNAs was discussed in Chapter 11 (page 459). Like most of the proteins we have been discussing that regulate mRNA translation and stability, miRNAs act primarily by binding to sites in the 3 UTR of their target mRNAs. It is becoming increasingly apparent that miRNAs are important regulators of virtually every biological process. Even the earliest stages of embryonic development require the involvement of miRNAs, as evidenced by the fact that animals lacking the miRNA-producing enzyme Dicer fail to develop beyond gastrulation. Similarly, when Dicer is absent only

12.6 Translational Control

wanted” mRNAs, P-bodies can also act as sites where mRNAs that are no longer being translated are stored temporarily. In the alternate mRNA degradation pathway shown in Figure 12.62b, removal of the poly(A) tail (step 3a) is followed by continued digestion of the mRNA from its 3 end (step 4a). The digestion of mRNAs in the 3 → 5 direction is carried out by an exonuclease that is part of a complex of 3 → 5 exonucleases called the exosome. There must be more to mRNA longevity than simply the length of the poly(A) tail, as mRNAs having very different half-lives begin with a similar-sized tail. Once again, differences in the nucleotide sequence of the 3 UTR have been shown to play a role in the rate at which the poly(A) tail becomes shortened. The 3 UTR of a globin mRNA, for example, contains a number of CCUCC repeats that serve as binding sites for specific proteins that stabilize the mRNA. If these sequences are mutated, the mRNA is destabilized. In contrast, short-lived mRNAs often contain AU-rich elements (e.g., AUUUA repeats) in their 3 UTR that destabilize the message. If one of these destabilizing sequences is introduced into the 3 UTR of a globin gene, the stability of the mRNA transcribed from the modified gene is reduced from a half-life of 10 hours to a half-life of 90 minutes. The importance of these destabilizing sequences (and the overall mRNA instabil-

Figure 12.62 mRNA degradation in mammalian cells. (a, b) The steps depicted in these drawings are described in the text. (c) Fluorescence micrograph showing P-bodies (yellow) in the cytoplasm of HeLa cells. The P-bodies are revealed as the site of localization of GFP-DCP1, a protein involved in mRNA decapping. Nuclei are red. (C: FROM F. TRITSCHLER ET AL., NATURE REVS. MOL. CELL BIOL. 11:380, 2010; BOX 1. REPRINTED BY PERMISSION OF MACMILLAN PUBLISHERS LTD. COURTESY OF D. LAZARETTI, MAX PLANCK INSTITUTE, TUEBINGEN.)
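The change in globin mRNA half-life quoted in the text (10 hours for the normal message, 90 minutes once an AU-rich destabilizing element is introduced) can be made concrete with a short calculation. The sketch below assumes simple first-order (exponential) decay, which the text does not state explicitly; the function and constant names are illustrative only.

```python
import math

# Half-lives taken from the text; first-order decay kinetics is an assumption.
STABLE_HALF_LIFE_H = 10.0        # normal globin mRNA
DESTABILIZED_HALF_LIFE_H = 1.5   # globin mRNA carrying an AU-rich element (90 min)


def decay_constant(half_life_h: float) -> float:
    """First-order rate constant k (per hour) for a given half-life."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h: float, half_life_h: float) -> float:
    """Fraction of the starting mRNA population left after t_h hours."""
    return 0.5 ** (t_h / half_life_h)


# Compare how much of each message survives at a few time points.
for t in (1.5, 3.0, 10.0):
    stable = fraction_remaining(t, STABLE_HALF_LIFE_H)
    unstable = fraction_remaining(t, DESTABILIZED_HALF_LIFE_H)
    print(f"t = {t:4.1f} h  stable: {stable:.3f}  destabilized: {unstable:.3f}")

# Ratio of the two rate constants: how many times faster the modified mRNA decays.
fold = decay_constant(DESTABILIZED_HALF_LIFE_H) / decay_constant(STABLE_HALF_LIFE_H)
print(f"decay is about {fold:.1f}-fold faster")
```

Under this assumption the rate constant scales inversely with the half-life, so the shortened half-life corresponds to a proportionally faster loss of the message.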

[Figure 12.63 labels: miRNP binding to an mRNA (eIF4E, ORF, poly(A) tail); deadenylation (followed by decapping and degradation); proteolysis (degradation of nascent peptide); initiation block (repressed cap recognition or 60S joining); elongation block (slowed elongation or ribosome 'drop-off'); P-body (mRNA storage or degradation).]

Figure 12.63 miRNA-mediated gene silencing. miRNAs, as part of an miRNP protein complex as illustrated in Figure 11.37, pair with sequence elements in the 3′ UTR of target mRNAs. They suppress gene expression posttranscriptionally by either promoting mRNA deadenylation and degradation (upper left), inhibiting the initiation of translation (lower left), inhibiting translation elongation (lower right), or possibly activating degradation of nascent peptides (upper right). Some mRNA:miRNA pairs may be stored in cytoplasmic P-bodies

from a particular tissue, the development of the cells of that tissue displays obvious abnormalities. It is also becoming apparent that abnormalities in miRNA levels play a major role in the development of many common diseases. MicroRNAs are thought to exert their regulatory activity by multiple mechanisms as depicted in Figure 12.63.


1. Most recent work supports a model that emerged from studies of zebrafish embryos, which showed that miR-430 functions to rid embryos of maternal mRNAs by inducing deadenylation and decay. In this model, miRNA pairing recruits exonucleases to the 3′ end of an mRNA target, resulting in shortening of the poly(A) tail followed by degradation. As discussed on page 536, many maternal mRNAs are stored for use as embryos develop, but as those embryos begin to transcribe their own genomes, maternal mRNAs are actively destroyed. miR-430 is extremely abundant during this transition in zebrafish embryos, and this one miRNA binds to hundreds of mRNAs, accelerating their decay. This mechanism of miRNA-mediated mRNA degradation has since been demonstrated in numerous other organisms, including mice and humans, and is especially prevalent in plant cells. The widespread occurrence of this pathway of mRNA degradation is best revealed by comparing the numbers of particular mRNAs in cells under different


and, upon derepression by removal of miRNAs, exit P-bodies and resume translation. ORF (open reading frame) corresponds to the amino acid coding segment of the mRNA. (W. FILIPOWICZ, ET AL., NAT. REVS GEN 9, 109, 2008, FIG. 3. NATURE REVIEWS GENETICS BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

conditions. It is generally found that mRNA levels are inversely correlated with the levels of complementary miRNAs. In other words, when a particular miRNA is introduced into a population of cells, the mRNAs that possess binding sites for that miRNA drop in number. Conversely, when the numbers of a particular miRNA are experimentally depleted, the abundance of mRNAs with binding sites for that miRNA concurrently increases.

2. A body of evidence also suggests that miRNAs can act by inhibiting the translation of mRNAs. Most evidence points to miRNAs acting at the step where translation is initiated, but they may also act to block translation elongation.

3. Several early studies suggest that miRNAs may recruit proteases that degrade nascent proteins during translation.

In addition to playing an important role in overall gene regulation, miRNAs also appear to be important mediators of stress responses. One way in which miRNAs might facilitate a rapid response to stress is by blocking translation of specific mRNAs and holding those mRNAs in place until some external cue or stress releases the inhibition and allows translation to resume. In this model, repressed mRNAs are held in cytoplasmic P-bodies and under specific conditions leave the P-body and resume translation (Figure 12.63).
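The inverse relationship between miRNA abundance and target mRNA abundance described under the first mechanism can be illustrated with a correlation coefficient. The sketch below is only an illustration: the numbers are invented for the example (they are not measurements from any study), and `pearson_r` is a hypothetical helper written out by hand.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical, made-up values: relative miRNA level in five cell populations
# and the relative abundance of an mRNA carrying binding sites for that miRNA.
mirna_level = [1, 2, 4, 8, 16]
target_mrna_level = [100, 80, 55, 30, 12]

r = pearson_r(mirna_level, target_mrna_level)
print(f"r = {r:.2f}")  # negative: target mRNA falls as the miRNA rises
```

A negative coefficient is what the degradation model predicts for mRNAs with binding sites; on its own, a correlation of this kind does not distinguish mRNA decay from the other proposed mechanisms.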


REVIEW

1. Describe three different ways in which gene expression might be controlled at the translational level. Cite an example of each of these control mechanisms.
2. What is the role of poly(A) in mRNA stability? How might the cell regulate the stability of different mRNAs?
3. Describe the different levels at which gene expression is regulated to allow a β-globin gene with the following structure to direct the formation of a protein that accounts for over 95 percent of the protein of the cell.

   exon 1 – intron – exon 2 – intron – exon 3
4. Describe some of the ways in which miRNAs might regulate gene expression.

12.7 | Posttranslational Control: Determining Protein Stability

We have seen how cells possess elaborate mechanisms to control the rates at which proteins are synthesized. It is not unexpected that cells would also possess mechanisms to control the length of time that specific proteins survive once they are fully functional. Although the subject of protein stability does not fall technically under the heading of control of gene expression, it is a logical extension of that topic. Pioneering studies in the area of selective protein degradation were carried out by Avram Hershko and Aaron Ciechanover in Israel and Irwin Rose and Alexander Varshavsky in the United States.

Degradation of cellular proteins is carried out within hollow, cylindrical, protein-degrading machines called proteasomes that are found in both the nucleus and cytosol of cells. Proteasomes consist of four rings of polypeptide subunits stacked one on top of the other with a cap attached at either end of the stack (Figure 12.64a,b). The two central rings consist of polypeptides (β subunits) that function as proteolytic enzymes. The active sites of these subunits face the enclosed central chamber, where proteolytic digestion can occur in a protected environment.

Proteasomes digest proteins that have been specifically selected and marked for destruction as described below. Some proteins are selected because they are recognized as abnormal, either misfolded or incorrectly associated with other proteins. Included in this group are abnormal proteins that had been produced on membrane-bound ribosomes of the rough ER (page 288). The selection of “normal” proteins for proteasome destruction is based on the protein’s biological stability. Every

[Figure 12.64 labels: caps at either end of the cylinder; rings of α subunits and β subunits; step numbers 1–3 of panel c.]

Figure 12.64 Proteasome structure and function. (a) High-resolution electron micrograph of an isolated Drosophila proteasome. (b) Model of a proteasome based on high-resolution electron microscopy and X-ray crystallography. Each proteasome consists of two multisubunit caps (or regulatory particles) on either end of a tunnel-shaped cylinder (or core particle) that is formed by a stack of four rings. Each ring consists of seven subunits that are divided into two classes, α-type and β-type. The two inner rings are composed of β subunits, which surround a central chamber. The β subunits are drawn in different shades of color because they are similar, but not identical, polypeptides. Three of the seven β subunits in each ring possess proteolytic activity; the other four are inactive in eukaryotic cells. The two outer rings are composed of enzymatically inactive α subunits that form a narrow opening (approximately 13 Å) through which unfolded polypeptide substrates are threaded to reach the central chamber, where they are degraded. (c) Steps in the degradation of proteins by a proteasome. In step 1, the protein to be degraded is linked covalently to a string of ubiquitin molecules. Ubiquitin attachment requires the participation of three distinct enzymes (E1, E2, and E3) in a process that is not discussed in the text. In step 2, the polyubiquitinated target protein binds to the cap of the proteasome. The ubiquitin chain is then removed, and the unfolded polypeptide is threaded into the central chamber of the proteasome (step 3), where it is degraded by the catalytic activity of the β subunits (steps 4 and 5). (A: COURTESY H. HÖLZL AND WOLFGANG BAUMEISTER, J. CELL BIOL. 150:126, 2000. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)


protein is thought to have a characteristic longevity or half-life. Some protein molecules, such as the enzymes of glycolysis or the globin molecules of an erythrocyte, are present for days to weeks. Other proteins that are required for a specific, fleeting activity, such as regulatory proteins that initiate DNA replication or trigger cell division, may survive only a few minutes. The destruction of such key regulatory proteins by proteasomes plays a crucial role in the progression of cellular processes (as illustrated in Figure 14.26). Velcade, a drug that specifically inhibits proteasomal digestion, has been approved for treatment of some forms of cancer.

The factors that control a protein’s lifetime are not well understood. One of the determinants is the specific amino acid that resides at the N-terminus of a polypeptide chain. Polypeptides that terminate in arginine or lysine, for example, are typically short-lived. A number of proteins that act at specific times within the cell cycle are marked for destruction when certain residues become phosphorylated. Still other proteins carry a specific internal sequence of amino acids called a degron that ensures they will not survive long within the cell.

Ubiquitin is a small, highly conserved protein with a number of functions in diverse cellular processes. We saw on page 312, for example, that membrane proteins bearing a single attached ubiquitin molecule are selectively incorporated into endocytic vesicles. A single attached ubiquitin functions primarily as a sorting signal. In contrast, proteins are targeted for destruction by attachment of a polyubiquitin chain consisting of multiple ubiquitin molecules linked covalently to one another (step 1, Figure 12.64c). In the first stage of this process, ubiquitin is transferred enzymatically from a carrier protein to a lysine residue on the condemned protein. Enzymes that transfer ubiquitin to target proteins comprise a huge family of ubiquitin ligases in which different members recognize proteins bearing different degradation signals. These enzymes play a crucial role in determining the life or death of key proteins and are a focus of current research.

Once it is polyubiquitinated, a protein is recognized by the cap of the proteasome (step 2, Figure 12.64c), which removes the polyubiquitin chain and unfolds the target protein using energy provided by ATP hydrolysis. The unfolded, linear polypeptide is then threaded through the narrow opening in the ring of α subunits and passed into the central chamber of the proteasome (step 3), where it is digested into small peptides (steps 4 and 5). The peptide products are released back into the cytosol where they are degraded into their component amino acids.


| Synopsis

Flow of genetic information. All cells contain a full complement of genes but only express a subset of these genes depending on environmental cues, developmental stages, rate of growth, or signals from other cells. All gene expression begins with transcription of DNA to produce RNA transcripts that either function directly as RNAs or are subsequently translated into proteins. The regulation of gene expression can occur at multiple levels along these paths. (p. 483)

In bacteria, genes are organized into regulatory units called operons. Operons are clusters of structural genes that typically encode different enzymes in the same metabolic pathway. Because all of the structural genes are transcribed into a single mRNA, their expression can be regulated in a coordinated manner. The level of gene expression is controlled by a key metabolic compound, such as the inducer lactose, which attaches to a protein repressor and changes its shape. This event alters the ability of the repressor to bind to the operator site on the DNA, thereby blocking transcription. (p. 484)

The nucleus of a eukaryotic cell is a complex structure bounded by the nuclear envelope, which controls the exchange of materials between the nucleus and cytoplasm. The nuclear envelope consists of several distinct components, including an inner and outer nuclear membrane separated by a perinuclear space and a variable number of nuclear pores. Nuclear pores are sites where the inner and outer nuclear membranes are fused to form a circular opening that is filled with a complex structure called the nuclear pore complex (NPC). The NPC has a basket-like structure with octagonal symmetry composed of rings, spokes, and filaments. The nuclear pores are the sites through which materials pass between the nucleus and cytoplasm. Proteins that normally reside within the nucleus contain a stretch of amino acids called the nuclear localization signal (NLS) that allows them to bind to a receptor (an importin) that transports them through the NPC. Nuclear transport is a process of diffusion facilitated by a gradient of the protein Ran, with Ran-GTP in the nucleus and Ran-GDP in the cytoplasm. The inner surface of the nuclear envelope is lined by a fibrillar meshwork called the nuclear lamina, which consists of proteins (lamins) that are members of the family of proteins that make up intermediate filaments. The fluid of the nucleus is called the nucleoplasm. (p. 488)

The chromosomes of the nucleus contain a defined complex of DNA and histone proteins that form characteristic nucleoprotein filaments, representing the first step in packaging the genetic material. Histones are small, basic proteins that are divided into five distinct classes. Histones and DNA are organized into nucleosome core complexes, which consist of two molecules each of histones H2A, H2B, H3, and H4 encircled by almost two turns of DNA. Nucleosome core particles are connected to one another by stretches of linker DNA. Together, the core particles and DNA linkers generate a nucleosome filament that resembles a chain of beads on a string. Covalent modifications to specific residues in the N-terminal tails of the core histones (including methylation, acetylation, and phosphorylation) play a key role in determining the state of compaction and transcriptional activity of chromatin. (p. 493)

Chromatin is not present in cells in the highly extended nucleosome filament but is compacted into higher levels of organization. Each nucleosome core particle contains a molecule of H1 histone bound to the DNA. Both core and H1 histones mediate an interaction between neighboring nucleosomes that generates a 30-nm fiber, which represents a higher level of chromatin organization. The 30-nm fibers are in turn organized into looped domains, which are most readily visualized when mitotic chromosomes are subjected to procedures that remove histones. Mitotic chromosomes represent the most compact state of chromatin. A certain fraction of the chromatin, called heterochromatin, remains in a highly compact state throughout interphase. Heterochromatin is categorized as constitutive heterochromatin, which remains condensed in all cells at all times, and facultative heterochromatin, which is specifically inactivated during certain phases of a cell’s life. One X chromosome in each cell of a female mammal is subjected to an inactivation process during embryonic development that converts it into transcriptionally inactive, facultative heterochromatin. Because X inactivation occurs randomly, the paternally derived X chromosome is inactivated in half the cells of the embryo, while the maternally derived X chromosome is inactivated in the other half. As a result, adult females are genetic mosaics with respect to genes present on the X chromosome. X-chromosome inactivation is an example of an epigenetic modification because it is a transmissible alteration in chromatin structure and function that does not involve a change in DNA sequence. (p. 496)

Mitotic chromosomes possess several clearly recognizable features. Mitotic chromosomes can be visualized by lysing cells that have been arrested in mitosis and then staining them to generate identifiable chromosomes with predictable banding patterns. Each mitotic chromosome contains a marked indentation, called the centromere, which contains highly repeated DNA sequences and serves as a site for the attachment of microtubules during mitosis. The ends of the chromosome are the telomeres, which are maintained from one cell generation to the next by a special enzyme, called telomerase, that contains RNA as an integral component. Telomeres play an important role in maintaining chromosome integrity. (p. 501)

The nucleus is an ordered cellular compartment. Chromosomes or regions of chromosomes are confined to particular regions of the nucleus; the telomeres may be associated with the nuclear envelope; RNPs involved in pre-mRNA splicing are restricted to particular sites. Evidence indicates that the nonrandom distribution of chromosomes is arranged through chromosomal territories that are linked to the nuclear envelope. (p. 510)

The rate of synthesis of a particular polypeptide in eukaryotic cells is determined by a complex series of regulatory events that operate primarily at three distinct levels. (1) Transcriptional control mechanisms determine whether or not a gene is transcribed and, if so, how often; (2) RNA processing control mechanisms determine the path by which primary RNA transcripts are processed; and (3) translational control mechanisms determine whether a particular mRNA is actually translated and, if so, how often and for how long a period. (p. 512)

All cells of a eukaryotic organism retain a full set of genetic information. Different genes are expressed by cells at different stages of development, by cells in different tissues, and by cells exposed to different stimuli. The transcription of a particular gene is controlled by transcription factors, which are proteins that bind to specific sequences located at sites outside of a gene’s coding region. The closest upstream regulatory sequence is the TATA box, which is a major component of the core promoter of many genes and the site of assembly of the preinitiation complex. The activity of proteins at the TATA box is thought to depend on interactions with other proteins bound to other sites, including various types of response elements and enhancers. Enhancers are distinguished by the fact that they can be moved from place to place within the DNA or even reversed in orientation. Some enhancers may be located tens of thousands of base pairs upstream from the gene whose transcription they stimulate. Proteins bound to an enhancer and a promoter are thought to be brought into contact by loops in the DNA. (p. 514)

Determination of the three-dimensional structure of a number of complexes between transcription factors and DNA indicates that these proteins bind to the DNA through a limited number of structural motifs. Transcription factors typically contain at least two domains, one whose function is to recognize and bind to a specific sequence of base pairs in the DNA, and another whose function is to activate transcription by interacting with other proteins. Most transcription factors bind to DNA as dimers, either heterodimers or homodimers, that recognize sequences in the DNA that possess twofold symmetry. Most of the DNA-binding motifs contain a segment, often an α helix, that is inserted into the major groove of the DNA, where it recognizes the sequence of base pairs that line the groove. Among the most common motifs that occur in DNA-binding proteins are the zinc finger, the helix–loop–helix, and the leucine zipper. Each of these motifs provides a structurally stable framework on which the specific DNA-recognizing surfaces of the protein can interact with the DNA double helix. Though most transcription factors are probably stimulatory, some act by inhibiting transcription. (p. 517)

The activation and repression of transcription is mediated by a number of large complexes that function as coactivators or corepressors. Coactivators include complexes that serve as bridges between transcriptional activators bound at upstream regulatory sites and the basal transcription machinery bound at the core promoter. Other types of coactivators act by modifying core histones or by remodeling chromatin. Acetylation of core histones by histone acetyltransferases (HATs) is associated with transcriptional activation. Chromatin remodeling complexes (e.g., SWI/SNF) can cause nucleosomes to slide along the DNA or modify the nucleosome so as to increase its ability to bind regulatory proteins. (p. 525)

Eukaryotic genes are silenced when the cytosine bases of certain nucleotides residing in GC-rich regions are methylated. Methylation is a dynamic, epigenetic modification. Addition of methyl groups to DNA in key regulatory regions upstream from genes correlates with a decrease in transcription of the gene. Methylation is particularly evident in chromatin that has been rendered transcriptionally inactive by heterochromatization, such as the inactive X chromosome in the cells of female mammals. Another phenomenon associated with DNA methylation is genomic imprinting, which affects a small number of genes that are either transcriptionally active or inactive in the embryo depending on their parental source. (p. 531)

Alternative splicing, in which a single gene can encode two or more related proteins, allows relatively small genomes to encode a large diversity of proteins. Many primary transcripts can be processed by more than one pathway, producing mRNAs containing different combinations of exons. In the simplest case, a specific intron can be either spliced out of the transcript or retained as part of the final mRNA. The specific pathway taken during pre-mRNA processing is thought to depend on the presence of proteins that control the splice sites that are recognized for cleavage. (p. 533)

Gene expression is regulated at the translational level by various processes, including mRNA localization, control of translation of existing mRNAs, and mRNA longevity. Most of these regulatory activities are mediated by interactions with the 5′ and 3′ untranslated regions (UTRs) of the mRNA. For example, cytoplasmic localization of mRNAs in particular regions of the eggs of fruit flies is thought to be mediated by proteins that recognize localization sequences in the 3′ UTR. The presence of an mRNA in the cytoplasm does not ensure its translation. The protein synthetic machinery of a cell may be controlled globally so that translation of all mRNAs is affected, or the translation of specific mRNAs can be targeted, as occurs in the control of ferritin synthesis by iron levels acting on proteins that bind to the 5′ UTR of the ferritin mRNA. One of the primary factors controlling the longevity (stability) of mRNAs is the length of the poly(A) tail. Specific nucleases in the cell gradually shorten the poly(A) tail until a point is reached where a protective protein can no longer stay bound to the tail. Once the protein is lost, the mRNA is degraded in either a 5′ → 3′ or 3′ → 5′ direction. Translation-level regulation may be mediated by microRNAs that can inhibit either the initiation or progression of translation or promote mRNA degradation. (p. 536)

| Analytic Questions

1. Methylation of K9 of histone H3 (by an enzyme SUV39H1) is associated with heterochromatinization and gene silencing. It has been reported that methylation of H3 by other enzymes can lead to transcriptional activation. How can methylation lead to opposite effects?
2. How many copies of each type of core histone would it take to wrap the entire human genome into nucleosomes? How has evolution solved the problem of producing such a large number of proteins in a relatively short period of time?
3. Suppose you discovered a temperature-sensitive mutant whose nucleus failed to accumulate certain nuclear proteins at an elevated (restrictive) temperature but continued to accumulate other nuclear proteins. What conclusions might you draw about nuclear localization and the nature of this mutation?
4. Humans born with three X chromosomes but no Y chromosomes often develop into females of normal appearance. How many Barr bodies would you expect the cells of these women to have? Why?
5. Suppose that X-inactivation were not a random process, but always led to the inactivation of the X chromosome derived from the father. What effect would you expect this to have on the phenotype of females?
6. The chromosomes shown in Figure 12.22b were labeled by incubating the preparation with DNA fragments known to be specific for each of the chromosomes. Suppose one of the chromosomes in the field contained regions having two different colors. What might you conclude about this chromosome?
7. What advantage might be gained by having transcripts synthesized and processed in certain regions of the nucleus rather than randomly throughout the nucleoplasm?
8. Compare and contrast the effect of a deletion in the operator of the lactose operon with one in the operator of the tryptophan operon.
9. If you were to find a mutant of E. coli that produced continuous polypeptide chains containing both β-galactosidase and galactoside permease (encoded by the y gene), how might you explain how this happened?
10. You suspect that a new hormone you are testing functions to stimulate myosin synthesis by acting at the transcriptional level. What type of experimental evidence would support this contention?
11. Suppose you had conducted a series of experiments in which you had transplanted nuclei from several different adult tissues into an activated, enucleated mouse egg and found that the egg did not develop past the blastocyst stage. Could you conclude that the transplanted nucleus had lost genes that were required for postblastula development? Why or why not? What does this type of experiment tell you more generally about interpreting negative results?
12. It was noted on page 523 that DNA footprinting allows isolation of DNA sequences that bind specific transcription factors. Describe an experimental protocol to identify transcription factors that bind to an isolated DNA sequence. (You might consider the techniques discussed in Section 18.7.)
13. How do you explain why enhancers can be moved around within the DNA without affecting their activity, whereas the TATA box can only operate at one specific site?
14. Suppose that you are working with a cell that exhibits a very low level of protein synthesis, and you suspect the cells are subject to a global translational-control inhibitor. What experiment might you perform to determine whether this is the case?
15. The signal sequences that direct the translocation of proteins into the endoplasmic reticulum are cleaved by a signal peptidase, whereas the NLSs and NESs required for movement of a protein into or out of the nucleus remain as part of that protein. Consider a protein such as hnRNPA1, which is involved in the export of mRNA to the cytoplasm. Why is it important that the transport signal sequences for this protein remain as part of the protein, whereas the signal sequence for ER proteins can be cleaved?
16. When methylated DNA is introduced into cultured mammalian cells, it is generally transcribed for a period before it becomes repressed. Why would you expect this type of delay before inhibition of transcription would occur?
17. Suppose you had isolated a new transcription factor and wanted to know which genes this protein might regulate. Is there any way that you could use a cDNA microarray of the type shown in Figure 12.35 to approach this question? (Note: the microarray of Figure 12.35 contains the DNA of protein-coding regions, unlike that of Figure 12.46).
18. Although several different mammalian species have been cloned, the efficiency of this process is extremely low. Often tens or even hundreds of oocytes must be implanted with donor nuclei to obtain one healthy live birth. Many researchers believe the difficulties with cloning reside in the epigenetic modifications, such as DNA and histone methylation, that occur within various cells during an individual’s life. How do you suspect such modifications might affect the success of an experiment such as that depicted in Figure 12.32?
19. A study in a British medical journal found there was a correlation in telomere length between fathers and their daughters and between mothers and both their sons and daughters but not between fathers and their sons. How can you explain this finding?
20. Some scientific reports are best described as correlations, in that they report on two events or conditions that tend to accompany one another. Correlations are often interpreted as evidence of a causal relationship between the two events or conditions. Other scientific reports involve experimental intervention and generally make a stronger case for a causal relationship. Pick one scientific conclusion that is put forth in this chapter that is based on both types of evidence. Which type of report do you find more convincing?


13 DNA Replication and Repair

13.1 DNA Replication
13.2 DNA Repair
13.3 Between Replication and Repair
THE HUMAN PERSPECTIVE: The Consequences of DNA Repair Deficiencies

Reproduction is a fundamental property of all living systems. The process of reproduction can be observed at several levels: organisms duplicate by asexual or sexual reproduction; cells duplicate by cellular division; and the genetic material duplicates by DNA replication. The machinery that replicates DNA is also called into action in another capacity: to repair the genetic material after it has sustained damage. These two processes, DNA replication and DNA repair, are the subjects addressed in this chapter.

The capacity for self-duplication is presumed to have been one of the first critical properties to have appeared in the evolution of the earliest primitive life forms. Without the ability to propagate, any primitive assemblage of biological molecules would be destined for oblivion. The early carriers of genetic information were probably RNA molecules that were able to self-replicate. As evolution progressed and RNA molecules were replaced by DNA molecules as the genetic material, the process of replication became more complex, requiring a large number of auxiliary components. Thus, although a DNA molecule contains the information for its own duplication, it lacks the ability to perform the activity itself. As Richard Lewontin expressed it, “the common image of DNA as a self-replicating molecule is about as true as describing a letter as a self-replicating document. The letter needs a photocopier; the DNA needs a cell.” Let us see then how the cell carries out this activity.

Three-dimensional model of a DNA helicase encoded by the bacteriophage T7. The protein consists of a ring of six subunits. Each subunit contains two domains. In this model, the central hole encircles only one of the two DNA strands. Driven by ATP hydrolysis, the protein moves in a 5′ → 3′ direction along the strand to which it is bound, displacing the complementary strand and unwinding the duplex. DNA helicase activity is required for DNA replication. (COURTESY OF EDWARD H. EGELMAN, UNIVERSITY OF VIRGINIA.)


13.1 | DNA Replication

The proposal for the structure of DNA by Watson and Crick in 1953 was accompanied by a suggested mechanism for its “self-duplication.” The two strands of the double helix are held together by hydrogen bonds between the bases. Individually, these hydrogen bonds are weak and readily broken. Watson and Crick envisioned that replication occurred by gradual separation of the strands of the double helix (Figure 13.1), much like the separation of two halves of a zipper. Because the two strands are complementary to each other, each strand contains the information required for construction of the other strand. Thus once the strands are separated, each can act as a template to direct the synthesis of the complementary strand and restore the double-stranded state.
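The templating idea can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names `complement_strand` and `replicate` and the example sequence are hypothetical, not taken from the text. It assumes that both strands are written 5′ → 3′, so the complement of a template is reversed to reflect the antiparallel arrangement of the two strands.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template: str) -> str:
    """Return the strand complementary to `template`.

    Assumption: both the template and the result are written 5'->3'.
    Because the two strands are antiparallel, the paired bases are
    read off in reverse order.
    """
    return "".join(PAIRING[base] for base in reversed(template.upper()))


def replicate(duplex):
    """Separate a duplex and rebuild the missing partner of each strand.

    `duplex` is a (strand, partner) tuple. Each daughter duplex keeps one
    parental strand and gains one strand built from it by base pairing.
    """
    strand, partner = duplex
    daughter_1 = (strand, complement_strand(strand))
    daughter_2 = (complement_strand(partner), partner)
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = ("ATGC", complement_strand("ATGC"))
    print(parent)             # the parental duplex
    print(replicate(parent))  # two daughter duplexes with the same sequence
```

Because each strand fully specifies its partner, both daughter duplexes come out identical to the parent, which is the point of the zipper analogy.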


Semiconservative Replication

The Watson-Crick proposal shown in Figure 13.1 made certain predictions concerning the behavior of DNA during replication. According to the proposal, each of the daughter duplexes should consist of one complete strand inherited from the parental duplex and one complete strand that has been newly synthesized. Replication of this type (Figure 13.2, scheme 1) is said to be semiconservative because each daughter duplex contains one strand from the parent structure. In the absence of information on the mechanism responsible for replication, two other types of replication had to be considered. In conservative replication (Figure 13.2, scheme 2), the two


Chapter 13 DNA Replication and Repair


Figure 13.1 The original Watson-Crick proposal for the replication of a double-helical molecule of DNA. During replication, the double helix unwinds, and each of the parental strands serves as a template for the synthesis of a new complementary strand. As discussed in this chapter, these basic tenets have been borne out.


Figure 13.2 Three alternate schemes of replication. Semiconservative replication is depicted in scheme 1, conservative replication in scheme 2, and dispersive replication in scheme 3. A description of the three alternate modes of replication is given in the text.


original strands would remain together (after serving as templates), as would the two newly synthesized strands. As a result, one of the daughter duplexes would contain only parental DNA, while the other daughter duplex would contain only newly synthesized DNA. In dispersive replication (Figure 13.2, scheme 3), the parental strands would be broken into fragments, and the new strands would be synthesized in short segments. Then the old fragments and new segments would be joined together to form a complete strand. As a result, the daughter duplexes would contain strands that were composites of old and new DNA. At first glance, dispersive replication might seem like an unlikely solution, but it appeared to Max Delbrück at the time as the only way to avoid the seemingly impossible task of unwinding two intertwined strands of a DNA duplex as it replicated (discussed on page 550).

To decide among these three possibilities, it was necessary to distinguish newly synthesized DNA strands from the original DNA strands that served as templates. This was accomplished in studies on bacteria in 1957 by Matthew Meselson and Franklin Stahl of the California Institute of Technology, who used heavy (15N) and light (14N) isotopes of nitrogen to distinguish between parental and newly synthesized DNA strands (Figure 13.3). These researchers grew bacteria in medium containing 15N-ammonium chloride as the sole nitrogen source. Consequently, the nitrogen-containing bases of the DNA of these cells contained only the heavy nitrogen isotope. Cultures of “heavy” bacteria were washed free of the old medium and incubated in fresh medium with light, 14N-containing compounds, and samples were removed at increasing intervals over a period of several generations. DNA was extracted from the samples of bacteria and subjected to equilibrium density-gradient centrifugation (see Figure 18.35). In this procedure, the DNA is mixed with a concentrated solution of cesium chloride and centrifuged until the double-stranded DNA molecules reach equilibrium according to their density.


Figure 13.3 Experiment demonstrating that DNA replication in bacteria is semiconservative. DNA was extracted from bacteria at different stages in the experiment, mixed with a concentrated solution of the salt cesium chloride (CsCl), placed into a centrifuge tube, and centrifuged to equilibrium at high speed in an ultracentrifuge. Cesium ions have sufficient atomic mass to be affected by the centrifugal force, and they form a density gradient during the centrifugation period with the lowest concentration (lowest density) of Cs at the top of the tube and the greatest concentration (highest density) at the bottom of the tube. During centrifugation, DNA fragments within the tube become localized at a position having a density equal to their own density, which in turn depends on the ratio of 15N/14N that is present in their nucleotides. The greater the 14N content, the higher in the tube the DNA fragment is found at equilibrium. (a) The results expected in this type of experiment for each of the three possible schemes of replication. The single tube on the left indicates the position of the parental DNA and the positions at which totally light or hybrid DNA fragments would band. (b) Experimental results obtained by Meselson and Stahl. The appearance of a hybrid band and the disappearance of the heavy band after one generation eliminates conservative replication. The subsequent appearance of two bands, one light and one hybrid, eliminates the dispersive scheme. (B: FROM M. MESELSON AND F. STAHL, PROC. NAT’L. ACAD. SCI. U.S.A. 44:671, 1958. COURTESY OF MATTHEW MESELSON.)
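The banding patterns expected under the three schemes can be worked out with a small simulation. The sketch below is a simplified model, not a reconstruction of the original analysis. It assumes that every duplex replicates once per generation, that each strand can be summarized by its fraction of 15N, and that a duplex bands at the average of its two strands. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

HEAVY, LIGHT = Fraction(1), Fraction(0)  # fraction of 15N in a strand


def semiconservative(duplex):
    """Each parental strand pairs with a newly made light strand."""
    old_a, old_b = duplex
    return [(old_a, LIGHT), (old_b, LIGHT)]


def conservative(duplex):
    """The parental duplex stays intact; the copy is entirely new."""
    return [duplex, (LIGHT, LIGHT)]


def dispersive(duplex):
    """Old and new material is spread evenly over all four strands."""
    mixed = sum(duplex) / 4
    return [(mixed, mixed), (mixed, mixed)]


def band_pattern(scheme, generations):
    """Map duplex 15N fraction -> share of all duplexes in that band.

    Starts from one fully heavy duplex that is then grown in light medium.
    """
    population = [(HEAVY, HEAVY)]
    for _ in range(generations):
        population = [d for duplex in population for d in scheme(duplex)]
    counts = Counter(sum(duplex) / 2 for duplex in population)
    return {band: Fraction(n, len(population)) for band, n in counts.items()}


if __name__ == "__main__":
    for scheme in (semiconservative, conservative, dispersive):
        for generation in (1, 2):
            print(scheme.__name__, generation, band_pattern(scheme, generation))
```

In this model only the semiconservative scheme gives a single hybrid band after one generation followed by equal hybrid and light bands after two, which is the pattern described in the caption.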


In the Meselson-Stahl experiment, the density of a DNA molecule is directly proportional to the percentage of 15N or 14N atoms it contains. If replication is semiconservative, one would expect that the density of DNA molecules would decrease during culture in the 14N-containing medium in the manner shown in the upper set of centrifuge tubes of Figure 13.3a. After one generation, all DNA molecules would be 15N–14N hybrids, and their buoyant density would be halfway between that expected for totally heavy and totally light DNA (Figure 13.3a). As replication continued beyond the first generation, the newly synthesized strands would continue to contain only light isotopes, and two types of duplexes would appear in the gradients: those containing 15N–14N hybrids and those containing only 14N. As the time of growth in the light medium continued, a greater and greater percentage of the DNA molecules present would be light. However, as long as replication continued semiconservatively, the original heavy parental strands would remain intact and present in hybrid DNA molecules that occupied a smaller and smaller percentage of the total DNA (Figure 13.3a). The results of the density-gradient experiments obtained by Meselson and Stahl are shown in Figure 13.3b, and they demonstrate unequivocally that replication occurs semiconservatively. The results that would have been obtained if replication occurred by conservative or dispersive mechanisms are indicated in the two lower sets of centrifuge tubes of Figure 13.3a.¹

¹ Anyone looking to explore the circumstances leading up to this heralded research effort and examine the experimental twists and turns as they unfolded might want to read the book Meselson, Stahl, and the Replication of DNA by Frederick Lawrence Holmes, 2001. A discussion of the experiment can also be found in PNAS 101:17889, 2004, which is on the Web.

By 1960, replication had been demonstrated to occur semiconservatively in eukaryotes as well. The original experiments were carried out by J. Herbert Taylor of Columbia University. The drawing and photograph of Figure 13.4 show the results of a more recent experiment in which cultured mammalian cells were allowed to undergo replication in bromodeoxyuridine (BrdU), a compound that is incorporated into DNA in place of thymidine. Following replication, a chromosome is made up of two chromatids. After one round of replication

Figure 13.4 Experimental demonstration that DNA replication occurs semiconservatively in eukaryotic cells. (a) Schematic diagram of the results of an experiment in which cells were transferred from a medium containing thymidine to one containing bromodeoxyuridine (BrdU) and allowed to complete two successive rounds of replication. DNA strands containing BrdU are shown in red. (b) The results of an experiment similar to that shown in a. In this experiment, cultured mammalian cells were grown in BrdU for two rounds of replication before mitotic chromosomes were prepared and stained by a procedure using fluorescent dyes and Giemsa stain. Using this procedure, chromatids containing thymidine within one or both strands stain darkly, whereas chromatids containing only BrdU stain lightly. The photograph indicates that, after two rounds of replication in BrdU, one chromatid of each duplicated chromosome contains only BrdU, while the other chromatid contains a strand of thymidine-labeled DNA. (Some of the chromosomes are seen to have exchanged homologous portions between sister chromatids. This process of sister chromatid exchange is common during mitosis but is not discussed in the text.) (B: COURTESY OF SHELDON WOLFF.)

in BrdU, both chromatids of each chromosome contained BrdU (Figure 13.4a). After two rounds of replication in BrdU, one chromatid of each chromosome was composed of two BrdU-containing strands, whereas the other chromatid was a hybrid consisting of a BrdU-containing strand and a thymidine-containing strand (Figure 13.4a,b). The thymidine-containing strand had been part of the original parental DNA molecule prior to addition of BrdU to the culture.
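The staining pattern described for Figure 13.4 follows directly from semiconservative replication, as the short sketch below illustrates. The names are hypothetical. The model assumes that each chromatid is a single duplex, that every new strand made in BrdU medium contains only BrdU, and that a chromatid stains darkly if at least one of its strands contains thymidine.

```python
THYMIDINE, BRDU = "T", "B"


def replicate_in_brdu(duplex):
    """Semiconservative copy: each old strand pairs with a new BrdU strand.

    Returns the two sister chromatids of the replicated chromosome.
    """
    return [(old_strand, BRDU) for old_strand in duplex]


def stain(chromatid):
    """Dark if any strand contains thymidine, light if both contain BrdU."""
    return "dark" if THYMIDINE in chromatid else "light"


def follow_rounds(rounds):
    """Stain every chromosome after each round of replication in BrdU."""
    duplexes = [(THYMIDINE, THYMIDINE)]  # one unlabeled chromosome to start
    history = []
    for _ in range(rounds):
        chromosomes = [replicate_in_brdu(duplex) for duplex in duplexes]
        history.append([[stain(c) for c in chromosome] for chromosome in chromosomes])
        # After division, each chromatid is the chromosome of a daughter cell.
        duplexes = [c for chromosome in chromosomes for c in chromosome]
    return history


if __name__ == "__main__":
    for round_number, stains in enumerate(follow_rounds(2), start=1):
        print(round_number, stains)
```

After one round both chromatids of the chromosome are hybrids and stain darkly; after two rounds every chromosome has one dark and one light chromatid.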


Replication in Bacterial Cells

We will focus in this section of the chapter on replication in bacterial cells, which is better understood than the corresponding process in eukaryotes. The early progress in bacterial research was driven by genetic and biochemical approaches including:

■ The availability of mutants that cannot synthesize one or another protein required for the replication process. The isolation of mutants unable to replicate their chromosome may seem paradoxical: how can cells with a defect in this vital process be cultured? This paradox was solved by the isolation of temperature-sensitive (ts) mutants, in which the deficiency only reveals itself at an elevated temperature, termed the nonpermissive (or restrictive) temperature. When grown at the lower (permissive) temperature, the mutant protein can function sufficiently well to carry out its required activity, and the cells can continue to grow and divide. Temperature-sensitive mutants have been isolated that affect virtually every type of physiologic activity (see also page 275), and they have been particularly important in the study of DNA synthesis as it occurs in replication, DNA repair, and genetic recombination.

■ The development of in vitro systems in which replication can be studied using purified cellular components. In some studies, the DNA molecule to be replicated is incubated with cellular extracts from which specific proteins suspected of being essential have been removed. In other studies, the DNA is incubated with a variety of purified proteins whose activity is to be tested.

Taken together, these approaches have revealed the activity of more than 30 different proteins that are required to replicate the chromosome of E. coli. In the following pages, we will discuss the activities of several of these proteins whose functions have been clearly defined. Replication in bacteria and eukaryotes occurs by very similar mechanisms, and thus most of the information presented in the discussion of bacterial replication applies to eukaryotic cells as well.

² The subject of initiation of replication is discussed in detail on page 559 as it occurs in eukaryotes.

Figure 13.5 Model of a circular bacterial chromosome undergoing bidirectional, semiconservative replication. Two replication forks move in opposite directions from a single origin. When the replication forks meet at the opposite point on the circle, replication is terminated, and the two replicated duplexes detach from one another. New DNA strands are shown in red.

Replication Forks and Bidirectional Replication

Replication begins at a specific site on the bacterial chromosome called the origin. The origin of replication on the E. coli chromosome is a specific sequence called oriC where a number of proteins bind to initiate the process of replication.² Once initiated, replication proceeds outward from the origin in both directions, that is, bidirectionally (Figure 13.5). The sites in Figure 13.5 where the pair of replicated segments come together and join the nonreplicated DNA are termed replication forks. Each replication fork corresponds to a site where (1) the parental double helix is undergoing strand separation, and (2) nucleotides are being incorporated into the newly synthesized complementary strands. The two replication forks move in opposite directions until they meet at a point across the circle from the origin, where replication is terminated. The two newly replicated duplexes detach from one another and are ultimately directed into two different cells.

Unwinding the Duplex and Separating the Strands

Separation of the strands of a circular, helical DNA duplex poses major topological problems. To visualize the difficulties,


we can briefly consider an analogy between a DNA duplex and a two-stranded helical rope. Consider what would happen if you placed a linear piece of this rope on the ground, took hold of the two strands at one end, and began to pull the strands apart just as DNA is pulled apart during replication. It is apparent that separation of the strands of a double helix is also a process of unwinding the structure. In the case of a rope, which is free to rotate around its axis, separation of the strands at one end would be accompanied by rotation of the entire fiber as it resisted the development of tension. Now, consider what would happen if the other end of the rope were attached to a hook on a wall (Figure 13.6a). Under these circumstances, separation of the two strands at the free end would generate increasing torsional stress in the rope and cause the unseparated portion to become more tightly wound. Separation of the two strands of a circular DNA molecule (or a linear molecule that is not free to rotate, as is the case in a large eukaryotic chromosome) is analogous to having one end of a linear molecule attached to a wall; in all of these cases, tension that develops in the molecule cannot be relieved by rotation of the entire molecule. Unlike a rope, which can become tightly overwound (as in Figure 13.6a), an overwound DNA molecule becomes positively supercoiled (page 397). Consequently, movement of the replication fork generates positive supercoils in the unreplicated portion of the DNA ahead of the fork


Figure 13.6 The unwinding problem. (a) The effect of unwinding a two-stranded rope that has one end attached to a hook. The unseparated portion becomes more tightly wound. (b) When a circular or attached DNA molecule is replicated, the DNA ahead of the replication machinery becomes overwound and accumulates positive supercoils. Cells possess topoisomerases, such as the E. coli DNA gyrase, that remove positive supercoils. (B: FROM J. C. WANG, NATURE REVIEWS MOL. CELL BIOL. 3:434, 2002, COPYRIGHT 2002. NATURE REVIEWS MOLECULAR CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

(Figure 13.6b). When one considers that a complete circular chromosome of E. coli contains approximately 400,000 turns and is replicated by two forks within 40 minutes, the magnitude of the problem becomes apparent. It was noted on page 398 that cells contain enzymes, called topoisomerases, that can change the state of supercoiling in a DNA molecule. One enzyme of this type, called DNA gyrase, a type II topoisomerase, relieves the mechanical strain that builds up during replication in E. coli. DNA gyrase molecules travel along the DNA ahead of the replication fork, removing positive supercoils. DNA gyrase accomplishes this feat by cleaving both strands of the DNA duplex, passing a segment of DNA through the double-stranded break to the other side, and then sealing the cuts, a process that is driven by the energy released during ATP hydrolysis (shown in detail in Figure 10.14b). Eukaryotic cells possess similar enzymes that carry out this required function.

The Properties of DNA Polymerases

We begin our discussion of the mechanism of DNA replication by describing some of the properties of DNA polymerases, the enzymes that synthesize new DNA strands. Study of these enzymes was begun in the 1950s by Arthur Kornberg at Washington University. In their initial experiments, Kornberg and his colleagues purified an enzyme from bacterial extracts that incorporated radioactively labeled DNA precursors into an acid-insoluble polymer identified as DNA. The enzyme was named DNA polymerase (and later, after the discovery of additional DNA-polymerizing enzymes, it was named DNA polymerase I). For the reaction to proceed, the enzyme required the presence of DNA and all four deoxyribonucleoside triphosphates (dTTP, dATP, dCTP, and dGTP). The newly synthesized, radioactively labeled DNA had the same base composition as the original unlabeled DNA, which strongly suggested that the original DNA strands had served as templates for the polymerization reaction.

As additional properties of the DNA polymerase were uncovered, it became apparent that replication was more complex than previously thought. When various types of template DNAs were tested, it was found that the template DNA had to meet certain structural requirements if it was to promote the incorporation of labeled precursors (Figure 13.7). An intact, double-stranded DNA molecule, for example, did not stimulate incorporation. This was not surprising considering the requirement that the strands of the helix must be separated for replication to occur. It was less obvious why a single-stranded, circular molecule was also devoid of activity; one might expect this structure to be an ideal template to direct the manufacture of a complementary strand. In contrast, addition of a partially double-stranded DNA molecule to the reaction mixture produced an immediate incorporation of nucleotides. It was soon discovered that a single-stranded DNA circle cannot serve as a template for DNA polymerase because the enzyme cannot initiate the formation of a DNA strand. Rather, it can only add nucleotides to the 3′ hydroxyl terminus of an existing strand. The strand that provides the necessary 3′ OH terminus is called a primer. All DNA polymerases, both prokaryotic and eukaryotic, have these same two basic requirements (Figure 13.8a): a template DNA strand to copy


and a primer strand to which nucleotides can be added. These requirements explain why certain DNA structures fail to promote DNA synthesis (Figure 13.7a). An intact, linear double helix provides the 3′ hydroxyl terminus but lacks a template. A circular single strand, on the other hand, provides a template but lacks a primer. The partially double-stranded molecule (Figure 13.7b) satisfies both requirements and thus promotes nucleotide incorporation. The finding that DNA polymerase cannot initiate the synthesis of a DNA strand raises a critical question: how is the synthesis of a new strand initiated in the cell? We will return to this question shortly. The DNA polymerase purified by Kornberg had another property that was difficult to understand in terms of its presumed role as a replicating enzyme: it only synthesized DNA in a 5′-to-3′ (written 5′ → 3′) direction. As Watson and Crick first discovered, the two strands of a DNA helix have an antiparallel orientation. The diagram of DNA replication first presented by Watson and Crick (see Figure 13.1) depicted events as they would be expected to occur at the replication fork. The diagram suggested that one of the newly synthesized strands is polymerized in a 5′ → 3′ direction, while the other strand is polymerized in a 3′ → 5′ direction. Is there some other enzyme responsible for the construction of the 3′ → 5′ strand? Does the enzyme work differently in the cell than under in vitro conditions? We will return to this question as well. During the 1960s, there were hints that the “Kornberg enzyme” was not the only DNA polymerase in a bacterial cell. Then in 1969, a mutant strain of E. coli was isolated that had less than 1 percent of the normal activity of the enzyme, yet


Figure 13.7 Templates and nontemplates for DNA polymerase activity. (a) Examples of DNA structures that do not stimulate the synthesis of DNA in vitro by DNA polymerase isolated from E. coli. (b) Examples of DNA structures that stimulate the synthesis of DNA in vitro. In all cases, the molecules in b contain a template strand to copy and a primer strand with a 3′ OH on which to add nucleotides.


Figure 13.8 The activity of a DNA polymerase. (a) The polymerization of a nucleotide onto the 3′ end of the primer strand. The enzyme selects nucleotides for incorporation based on their ability to pair with the nucleotide of the template strand. (b) A simplified model of the two-metal ion mechanism for the reaction in which nucleotides are incorporated into a growing DNA strand by a DNA polymerase. In this model, one of the magnesium ions draws the proton away from the 3′ hydroxyl group of the terminal nucleotide of the primer, facilitating the nucleophilic attack of the negatively charged 3′ oxygen atom on the α phosphate of the incoming nucleoside triphosphate. The second magnesium ion stabilizes the pyrophosphate, promoting its release. The two metal ions are bound to the enzyme by highly conserved aspartic acid residues of the active site. (c) Schematic diagram showing the direction of movement of each polymerase along the two template strands.


was able to multiply at the normal rate. Further studies revealed that the Kornberg enzyme, or DNA polymerase I, was only one of several distinct DNA polymerases present in bacterial cells. The major enzyme responsible for DNA replication (i.e., the replicative polymerase) is DNA polymerase III. A typical bacterial cell contains 300 to 400 molecules of DNA polymerase I but only about 10 copies of DNA polymerase III. The presence of DNA polymerase III had been masked by the much greater amounts of DNA polymerase I in the cell. But the discovery of other DNA polymerases did not answer the two basic questions posed above; none of the enzymes can initiate DNA chains, nor can any of them construct strands in a 3′ → 5′ direction.

Semidiscontinuous Replication

The lack of polymerization activity in the 3′ → 5′ direction has a straightforward explanation: DNA strands cannot be synthesized in that direction. Rather, both newly synthesized strands are assembled in a 5′ → 3′ direction. During the polymerization reaction, the -OH group at the 3′ end of the primer carries out a nucleophilic attack on the 5′ α-phosphate of the incoming nucleoside triphosphate, as shown in Figure 13.8b. The polymerase molecules responsible for construction of the two new strands of DNA both move in a 3′-to-5′ direction along the template, and both construct a chain that grows from its 5′-P terminus (Figure 13.8c). Consequently, one of the newly synthesized strands grows toward the replication fork where the parental DNA strands are being separated, while the other strand grows away from the fork. Although this solves the problem concerning an enzyme that synthesizes a strand in only one direction, it creates an even more complicated dilemma. It is apparent that the strand that grows toward the fork in Figure 13.8c can be constructed by the continuous addition of nucleotides to its 3′ end. But how is the other strand synthesized?

Evidence was soon gathered to indicate that the strand that grows away from the replication fork is synthesized discontinuously, that is, as fragments (Figure 13.9). Before the synthesis of a fragment can be initiated, a suitable stretch of template must be exposed by movement of the replication fork. Once initiated, each fragment grows away from the replication fork toward the 5′ end of a previously synthesized fragment to which it is subsequently linked. Thus, the two newly synthesized strands of the daughter duplexes are synthesized by very different processes. The strand that is synthesized continuously is called the leading strand because its synthesis continues as the replication fork advances. The strand that is synthesized discontinuously is called the lagging strand because initiation of each fragment must wait for the parental strands to separate and expose additional template (Figure 13.9). As discussed on page 554, both strands are probably synthesized simultaneously, so that the terms leading and lagging may not be as appropriate as thought when they were first coined. Because one strand is synthesized continuously and the other discontinuously, replication is said to be semidiscontinuous. The discovery that one strand was synthesized as small fragments was made by Reiji Okazaki of Nagoya University, Japan, following various types of labeling experiments.
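The difference between continuous and discontinuous synthesis can be sketched in code. The following is a toy model under stated assumptions, not a description of the enzymatic machinery: the fork is assumed to open the duplex in fixed steps of `fragment_len` nucleotides, priming is ignored, and the function names and example sequence are hypothetical. The top parental strand is written 5′ → 3′ from left to right, and the fork moves from left to right.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-pair partner of each position, in the same left-to-right order."""
    return "".join(PAIR[base] for base in seq)


def replicate_fork(top_5to3, fragment_len):
    """Copy both parental strands as a fork opens the duplex left to right.

    Returns (leading, okazaki): the leading strand as one 5'->3' string and
    the lagging-strand fragments, each 5'->3', in order of synthesis.
    """
    bottom_3to5 = complement(top_5to3)  # partner strand, read left to right
    leading = ""
    okazaki = []
    for start in range(0, len(top_5to3), fragment_len):
        end = min(start + fragment_len, len(top_5to3))
        # Leading strand: extended at its 3' end, toward the fork.
        leading += complement(bottom_3to5[start:end])
        # Lagging strand: a new fragment starts at the fork and grows away
        # from it, so written 5'->3' it is the reverse complement.
        okazaki.append(complement(top_5to3[start:end])[::-1])
    return leading, okazaki


def ligate(okazaki):
    """Join fragments 5'->3'; later fragments lie 5' of earlier ones."""
    return "".join(reversed(okazaki))


if __name__ == "__main__":
    leading, fragments = replicate_fork("ATGCCGTAAGCT", fragment_len=4)
    print(leading)            # made as a single continuous strand
    print(fragments)          # made as separate pieces
    print(ligate(fragments))  # the finished lagging strand, 5'->3'
```

Both new strands are built only 5′ → 3′, yet one emerges as a single piece and the other as fragments whose order along the finished strand is the reverse of their order of synthesis.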

Okazaki found that if bacteria were incubated in [3H]thymidine for a few seconds and immediately killed, most of the radioactivity could be found as part of small DNA fragments 1000 to 2000 nucleotides in length. In contrast, if cells were incubated in the labeled DNA precursor for a minute or two, most of the incorporated radioactivity became part of much larger DNA molecules (Figure 13.10). These results indicated that a portion of the DNA was constructed in small segments (later called Okazaki fragments) that were rapidly linked to longer pieces that had been synthesized previously. The enzyme that joins the Okazaki fragments into a continuous strand is called DNA ligase.

The discovery that the lagging strand is synthesized in pieces raised a new set of perplexing questions about the initiation of DNA synthesis. How does the synthesis of each of these fragments begin when none of the DNA polymerases are capable of strand initiation? Further studies revealed that initiation is not accomplished by a DNA polymerase but, rather, by a distinct type of RNA polymerase, called primase, that constructs a short primer composed of RNA, not DNA. The leading strand, whose synthesis begins at the origin of replication, is also initiated by a primase molecule. The short RNAs synthesized by the primase at the 5′ end of the leading strand and at the 5′ end of each Okazaki fragment serve as the required primer for the synthesis of DNA by a DNA polymerase. The RNA primers are subsequently removed, and the resulting gaps in the strand are filled with DNA and then sealed by DNA ligase. These events are illustrated schematically in Figure 13.11. The formation of transient RNA primers during the process of DNA replication is a curious activity. It is thought that the likelihood of mistakes is greater during initiation than during elongation, and the use of a short removable segment of RNA avoids the inclusion of mismatched bases.

Figure 13.9 The two strands of a double helix are synthesized by a different sequence of events. DNA polymerase molecules move along a template only in a 3′ → 5′ direction. As a result, the two newly assembled strands grow in opposite directions, one growing toward the replication fork and the other growing away from it. One strand is assembled in continuous fashion, the other as fragments that are joined together enzymatically. The diagram shown here depicts the differences in synthesis of the two strands.

Figure 13.10 Results of an experiment showing that part of the DNA is synthesized as small fragments. Sucrose density gradient profiles of DNA from a culture of phage-infected E. coli cells. The cells were labeled for increasing amounts of time, and the sedimentation velocity of the labeled DNA was determined. When DNA was prepared after very short pulses, a significant percentage of the radioactivity appeared in very short pieces of DNA (represented by the peak near the top of the tube on the left). After periods of 60–120 seconds, the relative height of this peak falls as labeled DNA fragments become joined to the ends of high-molecular-weight molecules. (FROM R. OKAZAKI ET AL., COLD SPRING HARBOR SYMP. QUANT. BIOL. 33:130, 1968. REPRINTED WITH PERMISSION FROM COLD SPRING HARBOR LABORATORY PRESS.)
[Axis labels in the plot: sedimentation velocity (S); distance from top; radioactivity (10^3 cts/min per 0.1 ml). Curves are labeled by pulse length: 2, 7, 15, 30, 60, and 120 sec.]

Figure 13.11 The use of short RNA fragments as removable primers in initiating synthesis of each Okazaki fragment of the lagging strand. The major steps are indicated in the drawing and discussed in the text. The roles of various accessory proteins in these activities are indicated in the following figures.
[Step labels in the drawing: RNA primer synthesis by primase; elongation by DNA polymerase III; primer removal and gap filling by DNA polymerase I; strand sealed by DNA ligase.]
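The sequence of events in Figure 13.11 can be followed in a short sketch. It is deliberately simplified: the primer is replaced in place rather than by extension from the neighboring fragment, and the function names, the `primer_len` parameter, and the example template are hypothetical. The template is read 3′ → 5′, so the new strand is produced 5′ → 3′.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}  # RNA uses U opposite A


def synthesize_fragment(template_3to5, primer_len):
    """Primase lays down an RNA primer; a DNA polymerase then extends it.

    Returns (rna_primer, dna_extension), both written 5'->3'.
    """
    primer = "".join(RNA_PAIR[b] for b in template_3to5[:primer_len])
    dna = "".join(DNA_PAIR[b] for b in template_3to5[primer_len:])
    return primer, dna


def replace_primer(fragment, template_3to5):
    """Remove the RNA primer and fill the gap with DNA copied from the template.

    Simplification: in the cell the gap is filled by extending the 3' end
    of the adjacent fragment; here the primer is just swapped for DNA.
    """
    primer, dna = fragment
    filled = "".join(DNA_PAIR[b] for b in template_3to5[:len(primer)])
    return filled + dna


def ligate(dna_fragments):
    """Seal adjacent all-DNA fragments into one continuous strand."""
    return "".join(dna_fragments)


if __name__ == "__main__":
    template = "TACGGCAT"  # read 3'->5'
    fragment = synthesize_fragment(template, primer_len=3)
    print(fragment)                            # RNA primer plus DNA extension
    print(replace_primer(fragment, template))  # the same stretch as DNA only
```

The finished product contains no trace of the primer, which is why its use does not compromise the DNA sequence that is ultimately retained.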

The Machinery Operating at the Replication Fork

Replication involves more than incorporating nucleotides. Unwinding the duplex and separating the strands require the aid of two types of proteins that bind to the DNA, a helicase (or DNA unwinding enzyme) and single-stranded DNA-binding (SSB) proteins. DNA helicases unwind a DNA duplex in a reaction that uses energy released by ATP hydrolysis to move along one of the DNA strands, breaking the hydrogen bonds that hold the two strands together and exposing the single-stranded DNA templates. E. coli has at least 12 different helicases for use in various aspects of DNA (and RNA) metabolism. One of these helicases, the product of the dnaB gene, serves as the major unwinding machine during replication. The DnaB helicase consists of six subunits arranged to form a ring-shaped protein that encircles a single DNA strand (Figure 13.12a). Initiation of replication begins in E. coli when multiple copies of the DnaA protein bind to the origin of replication (oriC) and separate (melt) the DNA strands at that site. The DnaB helicase is then loaded onto the single-stranded DNA of the lagging strand of oriC, with the help of the protein DnaC. The DnaB helicase then translocates in a 5′ → 3′ direction along the lagging-strand template, unwinding the helix as it proceeds (Figure 13.12). A three-dimensional model of a similarly shaped bacteriophage helicase engaged in strand separation during replication is depicted on page 545. DNA unwinding by the helicase is aided by the attachment of SSB proteins to the separated DNA strands (Figure 13.12). These proteins bind selectively to single-stranded DNA, keeping it in an extended state and preventing it from becoming rewound or damaged. A visual portrait of the combined action of a DNA helicase and SSB proteins on the

[Figure 13.11 label: strand sealed by DNA ligase.]

[Figure 13.12a labels: primase, DNA helicase, lagging strand, RNA primer, movement of helicase, single-stranded DNA-binding protein (SSB), DNA duplex, leading strand.]

Chapter 13 DNA Replication and Repair

[Figure 13.12b labels: DNA helicase, unwound DNA strand with SSB proteins; scale bar, 200 nm.]

Figure 13.12 The roles of the DNA helicase, single-stranded DNA-binding proteins, and primase at the replication fork. (a) The helicase moves along the DNA, catalyzing the ATP-driven unwinding of the duplex. As the DNA is unwound, the strands are prevented from reforming the duplex by tetrameric single-stranded DNA-binding proteins (SSBs). The primase associated with the helicase synthesizes the RNA primers that begin each Okazaki fragment. The RNA primers, which are about 10 nucleotides long, are subsequently removed. (b) A series of five electron micrographs showing DNA molecules incubated with a viral DNA helicase (T antigen, page 561) and E. coli SSB proteins. The DNA molecules are progressively unwound from left to right. The helicase appears as the round particle at the fork, and the SSB proteins are bound to the single-stranded ends, giving them a thickened appearance. (B: FROM RAINER WESSEL, JOHANNES SCHWEIZER, AND HANS STAHL, J. VIROL. 66:807, 1992. © AMERICAN SOCIETY FOR MICROBIOLOGY.)

structure of the DNA double helix is illustrated in the electron micrographs of Figure 13.12b.

Recall that an enzyme called primase initiates the synthesis of each Okazaki fragment. In bacteria, the primase and the helicase associate transiently to form what is called a “primosome.” Of the two members of the primosome, the helicase moves along the lagging-strand template processively (i.e., without being released from the template strand during the lifetime of the replication fork). As the helicase “motors” along the lagging-strand template, opening the strands of the duplex, the primase periodically binds to the helicase and synthesizes the short RNA primers that begin the formation of each Okazaki fragment. As noted above, the RNA primers are subsequently extended as DNA by a DNA polymerase, specifically DNA polymerase III.

A body of evidence suggests that the same DNA polymerase III molecule synthesizes successive fragments of the lagging strand. To accomplish this, the polymerase III molecule is recycled from the site where it has just completed one Okazaki fragment to the next site along the lagging-strand template closer to the site of DNA unwinding. Once at the new site, the polymerase attaches to the 3′ OH of the RNA primer that has just been laid down by a primase and begins to incorporate deoxyribonucleotides onto the end of the short RNA.

How does a polymerase III molecule move from one site on the lagging-strand template to another site that is closer to the replication fork? The enzyme does this by “hitching a ride” with the DNA polymerase that is moving in that direction along the leading-strand template. Thus even though the two polymerases are moving in opposite directions with respect to the linear axis of the DNA molecule, they are, in fact, part of a single protein complex (Figure 13.13). The two tethered polymerases can replicate both strands by looping the DNA of the lagging-strand template back on itself, causing this template to have the same orientation as the leading-strand template. Both polymerases then can move together as part of a single replicative complex without violating the “5′ → 3′ rule” for synthesis of a DNA strand (Figure 13.13). Once the polymerase assembling the lagging strand reaches the 5′ end of the Okazaki fragment synthesized during the previous round, the lagging-strand template is released and the polymerase begins work at the 3′ end of the next RNA primer toward the fork. The model depicted in Figure 13.13 is often referred to as the “trombone model” because the looping DNA repeatedly grows and collapses during the replication of the lagging strand, reminiscent of the movement of the brass “loop” of a trombone as it is played.³
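The discontinuous synthesis described above can be made concrete with a small sketch. The Python below is an illustrative toy, not a model of enzyme kinetics or of the trombone loop itself. The function names (`copy_exposed_template`, `okazaki_fragments`, `mature_lagging_strand`), the fixed fragment length, and the lowercase-for-RNA convention are all assumptions introduced here. It treats the lagging-strand template as a string written 5′ → 3′, lets the fork expose it stretch by stretch from the 5′ end (the direction in which the helicase travels), and builds each fragment 5′ → 3′ starting with a short RNA primer.

```python
# Toy model of discontinuous (lagging-strand) synthesis.
# All names and parameters are illustrative assumptions, not from the text.

DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "u", "T": "a", "G": "c", "C": "g"}   # lowercase marks RNA
RNA_TO_DNA = {"u": "T", "a": "A", "g": "G", "c": "C"}


def copy_exposed_template(template_5to3, primer_len):
    """Copy one exposed stretch of template; return the fragment 5'->3'.

    A polymerase reads its template 3'->5', so the template is reversed
    before copying. The first `primer_len` positions are laid down as RNA
    (the primase step); the rest is DNA (the polymerase III step).
    """
    read_order = template_5to3[::-1]
    primer = "".join(RNA_PAIR[base] for base in read_order[:primer_len])
    dna = "".join(DNA_PAIR[base] for base in read_order[primer_len:])
    return primer + dna


def okazaki_fragments(lagging_template_5to3, fragment_len, primer_len):
    """Return fragments in the order they are made as the fork advances."""
    fragments = []
    for start in range(0, len(lagging_template_5to3), fragment_len):
        exposed = lagging_template_5to3[start:start + fragment_len]
        fragments.append(copy_exposed_template(exposed, primer_len))
    return fragments


def mature_lagging_strand(fragments):
    """Replace RNA with DNA (the polymerase I step) and join (the ligase step).

    The most recently made fragment lies at the 5' end of the new strand,
    so the fragments are joined in reverse order of synthesis.
    """
    dna_only = ["".join(RNA_TO_DNA.get(base, base) for base in fragment)
                for fragment in fragments]
    return "".join(reversed(dna_only))


if __name__ == "__main__":
    template = "ATGCCGTAAGCT"   # hypothetical 12-nucleotide template
    made = okazaki_fragments(template, fragment_len=6, primer_len=2)
    print(made)
    print(mature_lagging_strand(made))
```

The fragment and primer lengths are kept tiny so the output is readable; in the cell the primers are about 10 nucleotides long and bacterial Okazaki fragments are far longer. The point of the sketch is only that each fragment grows 5′ → 3′ away from the fork, while successive fragments start progressively closer to it.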

The Structure and Functions of DNA Polymerases

DNA polymerase III, the enzyme that synthesizes DNA strands during replication in E. coli, is part of a large “replication machine” called the DNA polymerase III holoenzyme (Figure 13.14). One of the noncatalytic components of the holoenzyme, called the β clamp, keeps the polymerase associated with the DNA template. DNA polymerases (like RNA polymerases) possess two somewhat contrasting properties: (1) they must remain associated with the template over long stretches if they are to synthesize a continuous complementary strand, and (2) they must be attached loosely enough to the template to move from one nucleotide to the next. These contrasting properties are provided by the doughnut-shaped β clamp that encircles the DNA (Figure 13.15a) and slides along

³ Recent studies strongly suggest that the replication complex, or replisome, contains three copies of DNA polymerase III rather than the traditional model of two copies that is presented here. According to this revised model, two of the three enzymes act on the lagging strand and one acts on the leading strand. The role of the additional polymerase has not been clearly defined, but it may serve primarily to continue replication if synthesis of an Okazaki fragment is interrupted. The manner in which the activities of the two lagging-strand polymerases are coordinated is unclear, but the mechanism of lagging-strand synthesis remains as discussed in this section.

[Figure 13.13 labels: β subunit, DNA polymerase III, SSB, lagging-strand template, leading-strand template, unreplicated parental DNA, leading strand, DNA helicase, RNA primers #1 to #3, growing Okazaki fragment of lagging strand, lagging strand, 3′ OH, polymerase released from template, completed Okazaki fragment, RNA primer #1 to be replaced with DNA by DNA polymerase I with the nick sealed by DNA ligase, newly initiated Okazaki fragment, old Okazaki fragment.]

Figure 13.13 Replication of the leading and lagging strands in E. coli is accomplished by two DNA polymerases working together as part of a single complex. (a) The two DNA polymerase III molecules travel together, even though they are moving toward the opposite ends of their respective templates. This is accomplished by causing the lagging-strand template to form a loop. (b) The polymerase releases the lagging-strand template when it encounters the previously synthesized Okazaki fragment. (c) The polymerase that was involved in the assembly of the previous Okazaki fragment has now rebound the lagging-strand template farther along its length and is synthesizing DNA onto the end of RNA primer #3 that has just been constructed by the primase. (FROM D. VOET AND J. G. VOET, BIOCHEMISTRY, 2D ED.; COPYRIGHT © 1995, JOHN WILEY AND SONS, INC. REPRINTED BY PERMISSION OF JOHN WILEY AND SONS, INC.)

[Figure 13.14 labels: β clamp, leading strand, τ subunits, γ-clamp loader (ready to load β clamp for next Okazaki fragment), lagging strand, DNA helicase.]


Figure 13.14 Schematic representation of DNA polymerase III holoenzyme. The holoenzyme contains ten different subunits organized into several distinct components. Included as part of the holoenzyme are (1) two core polymerases, which replicate the DNA, (2) two or more β clamps, which allow the polymerase to remain associated with the DNA, and (3) a clamp loading (γ) complex, which loads each sliding clamp onto the DNA. The clamp loader of an active replication fork contains two τ subunits, which hold the core polymerases in the complex and also bind the helicase. Another term, the replisome, is often used to refer to the entire complex of proteins that is active at the replication fork, including the DNA polymerase III holoenzyme, the helicase, SSBs, and primase. (If the bacterial replisome contains three DNA polymerase III molecules, as discussed in footnote 3, page 554, then it would also contain three τ subunits to hold all the proteins in a complex, as shown in Science 335:329, 2012.) (BASED ON DRAWINGS BY M. O’DONNELL.)

[Figure 13.14 label: core polymerase. Figure 13.15 labels: β clamp, polymerase III.]

Figure 13.15 The β sliding clamp and clamp loader. (a) Space-filling model showing the two subunits that make up the doughnut-shaped β sliding clamp in E. coli. Double-stranded DNA is shown in blue within the β clamp. (b) Schematic diagram of polymerase cycling on the lagging strand. The polymerase is held to the DNA by the β sliding clamp as it moves along the template strand and synthesizes the complementary strand. Following completion of the Okazaki fragment, the enzyme disengages from its β clamp and cycles to a recently assembled clamp “waiting” at an upstream RNA primer–DNA template junction. The original β clamp is left behind for a period on the finished Okazaki fragment, but it is eventually disassembled and reutilized. (c) A model of a complex between a sliding clamp and a clamp loader from an archaean prokaryote based on electron microscopic image analysis. The clamp loader (shown with red and green subunits) is bound to the sliding clamp (blue), which is held in an open, spiral conformation resembling a lock-washer. The DNA has squeezed through the gap in the clamp. The primer strand of the DNA terminates within the clamp loader whereas the template strand extends through an opening at the top of the protein. The clamp loader has been described as a “screw-cap” that fits onto the DNA in such a way that the subdomains of the protein form a spiral that can thread onto the helical DNA backbone. The clamp-loading reaction is shown in detail in Science 334:1675, 2011. (A: FROM JOHN KURIYAN, CELL 69:427, 1992; WITH PERMISSION FROM ELSEVIER; B: FROM P. T. STUKENBERG, J. TURNER, AND M. O'DONNELL, CELL 78:878, 1994; CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER; C: FROM T. MIYATA, ET AL., PROC. NAT’L. ACAD. SCI. U.S.A. 102:13799, 2005, FIG. 4C. © 2005 NATIONAL ACADEMY OF SCIENCES, U.S.A. IMAGE PROVIDED COURTESY OF K. MORIKAWA, OSAKA, JAPAN.)

[Figure 13.15b labels: leading strand template, lagging strand template, previously synthesized Okazaki fragment.]

it. As long as it is attached to a β “sliding clamp,” a DNA polymerase can move processively from one nucleotide to the next without diffusing away from the template. The polymerase on the leading-strand template remains tethered to a single β clamp during replication. In contrast, when the polymerase on the lagging-strand template completes the synthesis of an Okazaki fragment, it disengages from the β clamp and is cycled to a new β clamp that has been assembled at an RNA primer–DNA template junction located closer to the replication fork (Figure 13.15b). But how does a highly elongated DNA molecule get inside of a ring-shaped clamp as in Figure 13.15a? The assembly of the β clamp around the DNA requires a multisubunit clamp loader that is also part of the DNA polymerase III holoenzyme (Figures 13.14, 13.15c). In the ATP-bound state, the clamp loader binds to a primer-template junction while holding the β clamp in an open conformation as illustrated in Figure 13.15c. Once the DNA has squeezed

through the opening in the clamp wall, the ATP bound to the clamp loader is hydrolyzed, causing the release of the clamp, which closes around the DNA. The β clamp is then ready to bind polymerase III as depicted in Figure 13.15b.

DNA polymerase I, which consists of only a single subunit, is involved primarily in DNA repair, a process by which damaged sections of DNA are corrected (page 564). DNA polymerase I also removes the RNA primers at the 5′ end of each Okazaki fragment during replication and replaces them with DNA, as described in Figure 13.13b. The enzyme’s ability to accomplish this feat is discussed in the following section.

Exonuclease Activities of DNA Polymerases

Now that we have explained several of the puzzling properties of DNA polymerase I, such as the enzyme’s inability to initiate strand synthesis, we can consider another curious observation. Kornberg found that DNA polymerase I preparations always contained exonuclease activities; that is, they were able to degrade DNA polymers by removing one or more nucleotides from the end of the molecule. At first, Kornberg assumed this activity was due to a contaminating enzyme because the action of exonucleases is so dramatically opposed to that of DNA synthesis. Nonetheless, the exonuclease activity could not be removed from the polymerase preparation and was, in fact, a true activity of the polymerase molecule. It was subsequently shown that all of the bacterial DNA polymerases possess exonuclease activity. Exonucleases can be divided into 5′ → 3′ and 3′ → 5′ exonucleases, depending on the direction in which the strand is degraded. DNA polymerase I has both 3′ → 5′ and 5′ → 3′ exonuclease activities, in addition to its polymerizing activity (Figure 13.16). These three activities are found in different domains of the single polypeptide. Thus, remarkably, DNA polymerase I is three different enzymes in one. The two exonuclease activities have entirely different roles in replication. We will consider the 5′ → 3′ exonuclease activity first. Most nucleases are specific for either DNA or RNA, but the 5′ → 3′ exonuclease of DNA polymerase I can degrade either type of nucleic acid. Initiation of Okazaki fragments by the primase leaves a stretch of RNA at the 5′ end of each fragment (see RNA primer #1 of Figure 13.13b), which is removed by the 5′ → 3′ exonuclease activity of DNA

[Figure 13.16a labels: 5′ → 3′ exonuclease hydrolysis site; single-strand nick; base sequences of the paired strands.]

polymerase I (Figure 13.16a). As the enzyme removes ribonucleotides of the primer, its polymerase activity simultaneously fills the resulting gap with deoxyribonucleotides. The last deoxyribonucleotide incorporated is subsequently joined covalently to the 5′ end of the previously synthesized DNA fragment by DNA ligase. The role of the 3′ → 5′ exonuclease activity will be apparent in the following section.

Ensuring High Fidelity during DNA Replication

The survival of an organism depends on the accurate duplication of the genome. A mistake made in the synthesis of a messenger RNA molecule by an RNA polymerase results in the synthesis of defective proteins, but an mRNA molecule is only one short-lived template among a large population of such molecules; therefore, little lasting damage results from the mistake. In contrast, a mistake made during DNA replication results in a permanent mutation and the possible elimination of that cell’s progeny. In E. coli, the chance that an incorrect nucleotide will be incorporated into DNA during replication and remain there is less than 10⁻⁹, or fewer than 1 out of 1 billion nucleotides. Because the genome of E. coli contains approximately 4 × 10⁶ nucleotide pairs, this error rate corresponds to fewer than 1 nucleotide alteration for every 100 replication cycles. This represents the spontaneous mutation rate in this bacterium. Humans are thought to have a similar spontaneous mutation rate for replication of protein-coding sequences.

Incorporation of a particular nucleotide onto the end of a growing strand depends on the incoming nucleoside triphosphate being able to form an acceptable base pair with the nucleotide of the template strand (see Figure 13.8b). Analysis of the distances between atoms and bond angles indicates that A-T and G-C base pairs have nearly identical geometry (i.e., size and shape). Any deviation from those pairings results in a different geometry, as shown in Figure 13.17. At each site along the template, DNA polymerase must discriminate

[Figure 13.17 residue: structural formulas of proper and mismatched base pairs, annotated with interatomic distances and bond angles. Figure 13.16b label: mismatched bases.]

Figure 13.17 Geometry of proper (top row) and mismatched (bottom row) base pairs. (FROM ANNUAL REVIEW OF BIOCHEMISTRY. VOL. 60 BY RICHARDSON, CHARLES REPRODUCED WITH PERMISSION OF ANNUAL REVIEWS, INCORPORATED IN THE FORMAT REPUBLISH IN A BOOK VIA COPYRIGHT CLEARANCE CENTER.)


Figure 13.16 The exonuclease activities of DNA polymerase I. (a) The 5′ → 3′ exonuclease function removes nucleotides from the 5′ end of a single-strand nick. This activity also plays a key role in removing the RNA primers. (b) The 3′ → 5′ exonuclease function removes mispaired nucleotides from the 3′ end of the growing DNA strand. This activity plays a key role in maintaining the accuracy of DNA synthesis. (FROM D. VOET AND J. G. VOET, BIOCHEMISTRY, 2D ED.; COPYRIGHT © 1995, JOHN WILEY AND SONS, INC. REPRINTED BY PERMISSION OF JOHN WILEY AND SONS, INC.)

[Figure 13.16b label: 3′ → 5′ exonuclease hydrolysis site. Remaining atom and base labels from Figures 13.16 and 13.17 omitted.]

among four different potential precursors as they move in and out of the active site. Among the four possible incoming nucleoside triphosphates, only one forms a proper geometric fit with the template, producing either an A-T or a G-C base pair that can fit into a binding pocket within the enzyme. This is only the first step in the discrimination process. If the incoming nucleotide is “perceived” by the enzyme as being correct, a conformational change occurs in which the “fingers” of the polymerase rotate toward the “palm” (Figure 13.18a), gripping the incoming nucleotide. This is an example of an induced fit as discussed on page 100. If the newly formed base pair exhibits improper geometry, the active site cannot achieve the conformation required for catalysis and the incorrect nucleotide is not incorporated. In contrast, if the base pair exhibits proper geometry, the incoming nucleotide is covalently linked to the end of the growing strand.

On occasion, the polymerase incorporates an incorrect nucleotide, resulting in a mismatched base pair, that is, a base pair other than A-T or G-C. It is estimated that an incorrect pairing of this sort occurs once for every 10⁵–10⁶ nucleotides incorporated, a frequency that is 10³–10⁴ times greater than the spontaneous mutation rate of approximately 10⁻⁹. How is the mutation rate kept so low? Part of the answer lies in the second of the two exonuclease activities mentioned above, the 3′ → 5′ activity (Figure 13.16b). When an incorrect nucleotide is incorporated by DNA polymerase I, the enzyme stalls and the end of the newly synthesized strand has an increased tendency to separate from the template and form a single-stranded 3′ terminus. When this occurs, the frayed end of the newly synthesized strand is directed into the 3′ → 5′ exonuclease site (Figure 13.18), which removes the mismatched nucleotide. This job of “proofreading” is one of the most remarkable of all enzymatic activities and illustrates the sophistication to which biological molecular machinery has evolved. The 3′ → 5′ exonuclease activity removes approximately 99 out of every 100 mismatched bases, raising the fidelity to about 10⁻⁷–10⁻⁸. In addition, bacteria possess a mechanism called mismatch repair that operates after replication (page 567) and corrects nearly all of the mismatches that escape the proofreading step. Together these processes reduce the overall observed error rate to about 10⁻⁹. Thus the fidelity of DNA replication can be traced to three distinct activities: (1) accurate selection of nucleotides, (2) immediate proofreading, and (3) postreplicative mismatch repair.

Another remarkable feature of bacterial replication is its rate. The replication of an entire bacterial chromosome in approximately 40 minutes at 37°C requires that each replication fork move about 1000 nucleotides per second, which is equivalent to the length of an entire Okazaki fragment. Thus the entire process of Okazaki fragment synthesis, including formation of an RNA primer, DNA elongation and simultaneous proofreading by the DNA polymerase, excision of the RNA, its replacement with DNA, and strand ligation, occurs within a few seconds. Although it takes E. coli approximately 40 minutes to replicate its DNA, a new round of replication can begin before the previous round has been completed. Consequently, when these bacteria are growing at their maximal rate, they double their numbers in about 20 minutes.
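The figures quoted in this section can be checked with a few lines of arithmetic. The Python sketch below simply restates the numbers given in the text as constants. Two values are assumptions added here and are not stated in this passage: that two forks leave the single origin in opposite directions, and that errors are counted over both newly made strands. The function names are hypothetical.

```python
import math

# Values quoted in the text
GENOME_BP = 4e6                        # E. coli genome, nucleotide pairs
MISINCORPORATION_RATES = (1e-5, 1e-6)  # polymerase selection errors per nucleotide
PROOFREADING_REMOVAL = 0.99            # 99 of every 100 mismatches removed
OBSERVED_ERROR_RATE = 1e-9             # after mismatch repair
REPLICATION_MINUTES = 40

# Assumptions introduced for this sketch
FORKS = 2        # bidirectional replication from one origin
NEW_STRANDS = 2  # one new strand made on each template


def rate_after_proofreading(misincorporation_rate,
                            removal=PROOFREADING_REMOVAL):
    """Error rate left once the 3'->5' exonuclease has acted."""
    return misincorporation_rate * (1.0 - removal)


def errors_per_replication(error_rate=OBSERVED_ERROR_RATE,
                           genome_bp=GENOME_BP,
                           new_strands=NEW_STRANDS):
    """Expected uncorrected errors in one round of genome replication."""
    return error_rate * genome_bp * new_strands


def fork_speed_nt_per_s(genome_bp=GENOME_BP,
                        minutes=REPLICATION_MINUTES,
                        forks=FORKS):
    """Nucleotides each fork must copy per second to finish on time."""
    return genome_bp / forks / (minutes * 60)


if __name__ == "__main__":
    for rate in MISINCORPORATION_RATES:
        print(f"{rate:.0e} -> {rate_after_proofreading(rate):.1e} after proofreading")
    expected = errors_per_replication()
    print(f"errors per replication: {expected:.1e} "
          f"(about one per {math.floor(1 / expected)} rounds)")
    print(f"fork speed: {fork_speed_nt_per_s():.0f} nucleotides per second")
```

Under these assumptions the expected number of uncorrected errors is well below one per 100 rounds of replication, and each fork must copy several hundred nucleotides per second, in line with the text's "about 1000."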

[Figure 13.18 labels: fingers, palm, thumb, newly synthesized strand, 3′ OH, template strand, mismatched base, 3′ → 5′ exonuclease site, mismatched base to be removed.]

Figure 13.18 Activation of the 3′ → 5′ exonuclease of DNA polymerase I. (a) A schematic model of a portion of DNA polymerase I known as the Klenow fragment, which contains the polymerase and 3′ → 5′ exonuclease active sites. The 5′ → 3′ exonuclease activity is located in a different portion of the polypeptide, which is not shown here. The regions of the Klenow fragment are often likened to the shape of a partially opened right hand, hence the portions labeled as “fingers,” “palm,” and “thumb.” The catalytic site for polymerization is located in the central “palm” subdomain. The 3′ terminus of the growing strand can be shuttled between the polymerase and exonuclease active sites. Addition of a mismatched base to the end of the growing strand produces a frayed (single-stranded) 3′ end that enters the exonuclease site, where it is removed. (The polymerase and exonuclease sites of polymerase III operate similarly but are located on different subunits.) (b) A molecular model of the Klenow fragment complexed to DNA. The template DNA strand being copied is shown in blue, and the primer strand to which the next nucleotides would be added is shown in red. (A: AFTER T. A. BAKER AND S. P. BELL, CELL 92:296, 1998, AFTER A DRAWING BY C. M. JOYCE AND T. A. STEITZ. CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER; B: COURTESY OF THOMAS A. STEITZ, YALE UNIVERSITY.)

Replication in Eukaryotic Cells

As noted in Chapter 10, the nucleotide letters of the human genome sequence would fill a book roughly one million pages in length. While it took several years for hundreds of researchers to initially sequence the human genome, a single cell nucleus of approximately 10 μm diameter can copy all of this DNA within a few hours. Given the fact that eukaryotic cells have large genomes and complex chromosomal structure, our understanding of replication in eukaryotes has lagged behind that in bacteria. This imbalance has been addressed by the development of eukaryotic experimental systems that parallel those used for decades to study bacterial replication. These include

■ The isolation of mutant yeast and animal cells unable to produce specific gene products required for various aspects of replication.

■ Analysis of the structure and mechanism of action of replication proteins from archaeal species (as in Figure 13.15c). Replication in these prokaryotes begins at multiple origins and requires proteins that are homologous to those of eukaryotic cells but are less complex and easier to study.

■ The development of in vitro systems where replication can occur in cellular extracts or mixtures of purified proteins. The most valuable of these systems has utilized Xenopus, an aquatic frog that begins life as a huge egg stocked with all of the proteins required to carry it through a dozen or so very rapid rounds of cell division. Extracts can be prepared from these frog eggs that will replicate any added DNA, regardless of sequence. Frog egg extracts will also support the replication and mitotic division of mammalian nuclei, which has made this a particularly useful cell-free system. Antibodies can be used to deplete the extracts of particular proteins, and the replication ability of the extract can then be tested in the absence of the affected protein.

Initiation of Replication in Eukaryotic Cells

Replication in E. coli begins at only one site along the single, circular chromosome (Figure 13.5). Cells of higher organisms may have a thousand times as much DNA as this bacterium, yet their polymerases incorporate nucleotides into DNA at much slower rates. To accommodate these differences, eukaryotic cells replicate their genome in small portions, termed replicons. Each replicon has its own origin from which replication forks proceed outward in both directions (see Figure 13.23a). In a human cell, replication begins at about 10,000 to 100,000 different replication origins. The existence of replicons was first demonstrated in autoradiographic experiments in which single DNA molecules were shown to be replicated simultaneously at several sites along their length (Figure 13.19).

Approximately 10 to 15 percent of replicons are actively engaged in replication at any given time during the S phase of the cell cycle (see Figure 14.1). Replicons located close together in a given chromosome undergo replication simultaneously (as evident in Figure 13.19). Moreover, those replicons active at a particular time during one round of DNA synthesis tend to be active at a comparable time in succeeding rounds. In mammalian cells, the timing of replication of a chromosomal region is roughly correlated with the activity of the genes in the region and/or its state of compaction. The presence of acetylated histones, which is closely correlated with gene transcription (page 527), is a likely factor in determining the early replication of active gene loci. The most highly compacted, least acetylated regions of the chromosome are packaged into heterochromatin (page 501), and they are the last regions to be replicated. This difference in timing of replication is not related to DNA sequence because the inactive, heterochromatic X chromosome in the cells of female mammals (page 498) is replicated late in S phase, whereas the active, euchromatic X chromosome is replicated at an earlier stage. Similarly, the β-globin locus replicates early during S phase in erythroid cells where the gene is expressed, but much later in nonerythroid cells where the gene remains silent.

The mechanism by which replication is initiated in eukaryotes has been a focus of research over the past decade. The greatest progress in this area has been made with budding yeast because the origins of replication can be removed from the yeast chromosome and inserted into bacterial DNA molecules, conferring on them the ability to replicate either within a yeast cell or in cellular extracts containing the required eukaryotic replication proteins. Because these sequences promote replication of the DNA in which they are contained, they are referred to as autonomous replicating sequences (ARSs). Those ARSs that have been isolated and analyzed share several distinct elements. The core element of an ARS consists of a conserved sequence of 11 base pairs, which functions as a specific binding site for an essential multiprotein complex called the origin recognition complex (ORC) (see Figure 13.20). If the ARS is mutated so that it is unable to bind the ORC, initiation of replication cannot occur.

Replication origins have proven more difficult to study in vertebrate cells than in yeast. Part of the problem stems from the fact that virtually any type of purified, naked DNA is suitable for replication using extracts from frog eggs. These studies suggested that, unlike yeast, vertebrate DNA does not possess specific sequences (e.g., ARSs) at which replication is

Figure 13.19 Experimental demonstration that replication in eukaryotic chromosomes begins at many sites along the DNA. Cells were incubated in [³H]thymidine for a brief period before preparation of DNA fibers for autoradiography. The lines of black silver grains indicate sites that had incorporated the radioactive DNA precursor during the labeling period. It is evident that synthesis is occurring at separated sites along the same DNA molecule. As indicated in the accompanying line drawing, initiation begins in the center of each site of thymidine incorporation, forming two replication forks that travel away from each other until they meet a neighboring fork. (COURTESY OF JOEL HUBERMAN.)


initiated. However, studies of replication of intact mammalian chromosomes in vivo suggest that replication does begin within defined regions of the DNA, rather than by random selection as occurs in the amphibian egg extract. It is thought that a DNA molecule contains many sites where DNA replication can be initiated, but only a subset of these potential sites are actually used at a given time in a given cell. Cells that reproduce via shorter cell cycles, such as those of early amphibian embryos, utilize a greater number of sites as origins of replication than cells with longer cell cycles. The actual selection of sites for initiation of replication is thought to be governed by local epigenetic factors (page 509), such as the positions of nucleosomes, the types of histone modifications, the state of DNA methylation, the degree of supercoiling, and the level of transcription.

Restricting Replication to Once Per Cell Cycle

It is essential that each portion of the genome is replicated once, and only once, during each cell cycle. Consequently, some mechanism must exist to prevent the reinitiation of replication at a site that has already been duplicated. The initiation of replication at a particular origin requires passage of the origin through several distinct states. Some of the steps that occur at an origin of replication in a yeast cell are illustrated in Figure 13.20. Similar steps requiring homologous proteins take place in plants and animals, indicating that the basic mechanism of initiation of replication is conserved among eukaryotes.
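The passage of an origin through distinct states, and the way Cdk activity blocks relicensing, can be sketched as a small state machine. The Python below is a deliberately simplified illustration of the yeast scheme summarized in Figure 13.20. The class, state, and method names are assumptions introduced here, the Cdc6 and Cdt1 loading factors are folded into a single licensing step, and kinase activity is reduced to two booleans.

```python
from enum import Enum, auto


class OriginState(Enum):
    ORC_BOUND = auto()  # ORC on the origin, no MCM helicase loaded
    LICENSED = auto()   # prereplication complex (pre-RC) assembled
    FIRED = auto()      # replication initiated, helicases active


class Origin:
    """Hypothetical model of one yeast replication origin."""

    def __init__(self):
        self.state = OriginState.ORC_BOUND
        self.times_fired = 0

    def try_license(self, cdk_active):
        """Load MCM proteins to form a pre-RC; possible only when Cdk is off."""
        if self.state is OriginState.ORC_BOUND and not cdk_active:
            self.state = OriginState.LICENSED
            return True
        return False

    def try_fire(self, cdk_active, ddk_active):
        """Initiate replication; requires a pre-RC and both kinases."""
        if self.state is OriginState.LICENSED and cdk_active and ddk_active:
            self.state = OriginState.FIRED
            self.times_fired += 1
            return True
        return False

    def finish_replication(self):
        """MCM helicases are displaced; only the ORC remains on the origin."""
        if self.state is OriginState.FIRED:
            self.state = OriginState.ORC_BOUND


if __name__ == "__main__":
    origin = Origin()
    origin.try_license(cdk_active=False)               # after mitosis: Cdk off
    origin.try_fire(cdk_active=True, ddk_active=True)  # start of S phase
    origin.finish_replication()
    # Cdk stays high through mitosis, so relicensing and refiring both fail
    print(origin.try_license(cdk_active=True))
    print(origin.try_fire(cdk_active=True, ddk_active=True))
    print(origin.times_fired)
```

The design point is that licensing and firing have opposite requirements for Cdk activity, so an origin cannot be both reloaded and activated within the same window of the cell cycle.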

Chapter 13 DNA Replication and Repair

Figure 13.20 Steps leading to the replication of a yeast replicon. Yeast origins of replication contain a conserved sequence (ARS) that binds the multisubunit origin recognition complex (ORC) (step 1). The presence of the bound ORC is required for initiation of replication. The ORC is bound to the origin throughout the yeast cell cycle. In step 2, a complex of six proteins (Mcm2–Mcm7) binds to the origin during or following mitosis, establishing a prereplication complex (pre-RC) that is competent to initiate replication, given the proper stimulus. Loading of MCM proteins at the origin requires additional proteins (Cdc6 and Cdt1, not shown). In step 3, DNA replication is initiated following activation of a cyclin-dependent kinase (Cdk) and a second protein kinase (DDK). Step 4 shows a stage where replication has proceeded a short distance in both directions from the origin. Each MCM complex forms a replicative DNA helicase that unwinds DNA at one of the oppositely directed replication forks. The other proteins required for replication are not shown in this illustration but are indicated in the next figure. In step 5, the two strands of the original duplex have been replicated, an ORC is present at both origins, and the replication proteins, including the MCM helicases, have been displaced from the DNA. In yeast, the MCM proteins are exported from the nucleus, and reinitiation of replication cannot occur until the cell has passed through mitosis. [In vertebrate cells, several events appear to prevent reinitiation of replication, including (1) release of the ORC complex after its use in S phase, (2) continued Cdk activity from S phase into mitosis, (3) phosphorylation of Cdc6 and its subsequent export from the nucleus, and (4) degradation of Cdt1 or its inactivation by a bound inhibitor.]

1. In step 1 (Figure 13.20), the origin of replication is bound by an ORC protein complex, which in yeast cells remains associated with the origin throughout the cell cycle. The ORC has been described as a “molecular landing pad” because of its role in binding the proteins required in subsequent steps.
2. The next major step (step 2) is the assembly of a protein–DNA complex, called the prereplication complex (pre-RC), that is “licensed” (competent) to initiate replication. Studies of the formation of the pre-RC have focused on a set of six related MCM proteins (Mcm2–Mcm7). The MCM proteins are loaded onto the replication origin at a late stage of mitosis, or soon thereafter, with the aid of accessory factors that have previously bound to the ORC. The six Mcm2–Mcm7 proteins interact with one another


to form a hexameric (6-membered) ring-shaped complex (the MCM complex) that possesses helicase activity (as in step 4, Figure 13.20). Evidence strongly suggests that the MCM complex is the eukaryotic replicative helicase; that is, the helicase responsible for unwinding DNA at the replication fork (analogous to DnaB in E. coli). At the pre-RC stage shown in step 2, each of the origins contains a double hexameric MCM complex, that is, two complete replicative helicases, which remain inactive at this stage of the cell cycle. The two helicases will travel in opposite directions away from the origin once replication begins.
3. The assembly of a pre-RC marks that site on the genome as a potential origin of replication, but does not guarantee that it will actually be a site where replication will be initiated. During most cell cycles, many more pre-RCs are assembled than will be used, and it is not clear what determines which of these potential sites of replication are subsequently selected. Regardless of the selection mechanism, just before the beginning of S phase of the cell cycle, the activation of key protein kinases leads to the phosphorylation of the MCM complex and other proteins and to the initiation of replication at selected sites in the genome (step 3, Figure 13.20). One of these protein kinases is a cyclin-dependent kinase (Cdk) whose function is discussed at length in Chapter 14. Cdk activity remains high from S phase through mitosis, which suppresses the formation of new prereplication complexes. Consequently, each origin can be activated only once per cell cycle. Cessation of Cdk activity at the end of mitosis permits the assembly of pre-RCs for the next cell cycle.
4. Once replication is initiated at the beginning of S phase, the MCM helicase moves with the replication fork (step 4), although the mechanism of action of this ring-shaped protein is debated. The fate of the MCM proteins after replication depends on the species studied. In mammalian cells, the MCM proteins are displaced from the DNA but

apparently remain in the nucleus. Regardless, MCM proteins cannot reassociate with an origin of replication that has already “fired.”

The Eukaryotic Replication Fork

Overall, the activities that occur at replication forks are quite similar, regardless of the type of genome being replicated, whether viral, bacterial, archaeal, or eukaryotic. The various proteins in the replication “tool kit” of eukaryotic cells are listed in Table 13.1 and depicted in Figure 13.21. All replication systems require helicases, single-stranded DNA-binding proteins, topoisomerases, primase, DNA polymerase, sliding clamp and clamp loader, and DNA ligase. When studying the initiation of eukaryotic replication in vitro, researchers often combine mammalian replication proteins with a viral helicase called the large T antigen, which is encoded by the SV40 genome. The large T antigen induces strand separation at the SV40 origin of replication and unwinds the DNA as the replication fork progresses (as in Figure 13.21a).

As in bacteria, the DNA of eukaryotic cells is synthesized in a semidiscontinuous manner, although the Okazaki fragments of the lagging strand are considerably smaller than in bacteria, averaging about 150 nucleotides in length. As in E. coli, the leading and lagging strands are thought to be synthesized in a coordinate manner by a single replicative complex, or replisome (Figure 13.21b).

To date, five “classic” DNA polymerases have been isolated from eukaryotic cells, and they are designated α, β, γ, δ, and ε. Of these enzymes, polymerase γ replicates mitochondrial DNA, and polymerase β functions in DNA repair. The other three polymerases have replicative functions. Polymerase α is tightly associated with the primase, and together they initiate the synthesis of each Okazaki fragment. The polymerase α–primase complex recognizes and binds to unwound DNA that is coated by a single-stranded DNA-binding protein called RPA. Primase initiates synthesis by assembly of a short RNA primer, which is then extended by the addition of about 20 deoxyribonucleotides by polymerase α. Polymerase δ is thought to

Table 13.1 Some of the Proteins Required for Replication

E. coli protein | Eukaryotic protein | Function
----------------|--------------------|---------
DnaA | ORC proteins | Recognition of origin of replication
Gyrase | Topoisomerase I/II | Relieves positive supercoils ahead of replication fork
DnaB | Mcm | DNA helicase that unwinds parental duplex
DnaC | Cdc6, Cdt1 | Loads helicase onto DNA
SSB | RPA | Maintains DNA in single-stranded state
γ-complex | RFC | Subunits of the DNA polymerase holoenzyme that load the clamp onto the DNA
pol III core | pol δ/ε | Primary replicating enzymes; synthesize entire leading strand and Okazaki fragments; have proofreading capability
β clamp | PCNA | Ring-shaped subunit of DNA polymerase holoenzyme that clamps replicating polymerase to DNA; works with pol III in E. coli and pol δ or ε in eukaryotes
Primase | Primase | Synthesizes RNA primers
(none) | pol α | Synthesizes short DNA oligonucleotides as part of RNA–DNA primer
DNA ligase | DNA ligase | Seals Okazaki fragments into continuous strand
pol I | FEN-1 | Removes RNA primers; pol I of E. coli also fills gap with DNA

13.1 DNA Replication


Figure 13.21 A schematic view of the major components at the eukaryotic replication fork. (a) The proteins required for eukaryotic replication. The viral T antigen is drawn as the replicative helicase in this figure because it is prominently employed in in vitro studies of DNA replication. DNA polymerases δ and ε are thought to be the primary DNA synthesizing enzymes of the lagging and leading strands, respectively. PCNA acts as a sliding clamp for both polymerases δ and ε. The sliding clamp is loaded onto the DNA by a protein called RFC (replication factor C), which is similar in structure and function to the γ-clamp loader of E. coli. RPA is a trimeric single-stranded DNA-binding protein comparable in function to that of SSB utilized in E. coli replication. The RNA–DNA primers of the lagging strand that are synthesized by the polymerase α–primase complex are displaced by the continued movement of polymerase δ, generating a flap of RNA–DNA that is removed by the FEN-1 endonuclease. The gap is sealed by a DNA ligase. As in E. coli, a topoisomerase is required to remove the positive supercoils that develop ahead of the replication fork. (b) A proposed version of events at the replication fork illustrating how the replicative polymerases on the leading- and lagging-strand templates might act together as part of a replisome. To date there is no firm evidence that the leading and lagging strands are replicated by a single replicative complex as in E. coli.

Figure 13.22 Demonstration that replication activities do not occur randomly throughout the nucleus but are confined to distinct sites. Prior to the onset of DNA synthesis at the start of S phase, various factors required for the initiation of replication are assembled at discrete sites within the nucleus, forming prereplication centers. These sites appear as discrete red objects in the micrograph, which has been stained with a fluorescent antibody against replication factor A (RPA), which is a single-stranded DNA-binding protein required for replication. Other replication factors, such as PCNA and the polymerase–primase complex, are also localized to these foci. (FROM YASUHISA ADACHI AND ULRICH K. LAEMMLI, EMBO J. VOL. 13, COVER NO. 17, 1994. © 1994, REPRINTED BY PERMISSION OF MACMILLAN PUBLISHERS LTD.)

be the primary DNA-synthesizing enzyme during replication of the lagging strand, whereas polymerase ε is thought to be the primary DNA-synthesizing enzyme during replication of the leading strand. Like the major replicating enzyme of E. coli, both polymerases δ and ε require a “sliding clamp” that tethers the enzyme to the DNA, allowing it to move processively along a template. The sliding clamp of eukaryotic cells is very similar in structure and function to the β clamp of E. coli polymerase III illustrated in Figure 13.15. In eukaryotes, the sliding clamp is called PCNA. The clamp loader that loads PCNA onto the DNA is called RFC and is analogous to the E. coli polymerase III clamp loader complex. After synthesizing an RNA–DNA primer, polymerase α is replaced at each template–primer junction by the PCNA–polymerase δ complex, which completes synthesis of the Okazaki fragment. When polymerase δ reaches the 5′ end of the previously synthesized Okazaki fragment, the polymerase continues along the lagging-strand template, displacing the primer (shown as a green flap in Figure 13.21a). The displaced primer is cut from the newly synthesized DNA strand by an endonuclease (FEN-1) and the resulting nick in the DNA is sealed by a DNA ligase. FEN-1 and DNA ligase are thought to be recruited to the replication fork through an interaction with the PCNA sliding clamp. In fact, PCNA is thought to play a major role in orchestrating events that occur during DNA replication, repair, and recombination. Because of its ability to bind a diverse array of proteins, PCNA has been referred to as a “molecular toolbelt.”

Like bacterial polymerases, all of the eukaryotic polymerases elongate DNA strands in the 5′ → 3′ direction by the addition of nucleotides to a 3′ hydroxyl group, and none of them is able to initiate the synthesis of a DNA chain without a primer. Polymerases γ, δ, and ε possess a 3′ → 5′ exonuclease, whose proofreading activity ensures that replication occurs with very high accuracy. Several other DNA polymerases (including η, κ, and ι) have a specialized function that allows cells to replicate damaged DNA as described on page 569.

Replication and Nuclear Structure

Replication forks that are active at a given time are not distributed randomly throughout the cell nucleus, but instead are localized within 50 to 250 sites, called replication foci (Figure 13.22). It is estimated that each of the bright red regions indicated in Figure 13.22 contains approximately 10 to 100 replication forks incorporating nucleotides into DNA strands simultaneously. The clustering of replication forks may provide a mechanism for coordinating the replication of adjacent replicons on individual chromosomes (as in Figure 13.19).

Chromatin Structure and Replication

The chromosomes of eukaryotic cells consist of DNA tightly complexed to regular arrays of histone proteins that are present in the form of nucleosomes (page 494). Movement of the replication machinery along the DNA is thought to displace nucleosomes that reside in its path. Yet, examination of a replicating DNA molecule with the electron microscope reveals nucleosomes on both daughter duplexes very near the replication fork (Figure 13.23a), indicating that the reassembly of nucleosomes is a very rapid event. Collectively, the nucleosomes that form during the replication process are composed of a roughly equivalent mixture of histone molecules that are inherited from parental chromosomes and histone molecules that have been newly synthesized. Recall from page 495 that the core histone octamer of a nucleosome consists of an (H3H4)2 tetramer together with a pair of H2A/H2B dimers. The way in which parental nucleosomes are distributed during replication has been an area of recent debate. According to results from classic experiments, the (H3H4)2 tetramers present prior to replication remain intact and are distributed randomly between the two daughter duplexes. As a result, old and new (H3H4)2 tetramers are thought to be intermixed on each daughter DNA molecule as indicated in the model shown in Figure 13.23b. According to this model, the two H2A/H2B dimers of each parental nucleosome fail to remain together as the replication fork moves through the chromatin. Instead, the H2A/H2B dimers of a nucleosome separate from one another and bind randomly to the new and old (H3H4)2 tetramers already present on the daughter duplexes (Figure 13.23b). Results of several recent


Figure 13.23 The distribution of histone core complexes to daughter strands following replication. (a) Electron micrograph of chromatin isolated from the nucleus of a rapidly cleaving Drosophila embryo showing a pair of replication forks (arrows) moving away from each other in opposite directions. Between the two forks one sees regions of newly replicated DNA that are already covered by nucleosomal core particles to the same approximate density as the parental strands that have not yet undergone replication. (b) Schematic model showing the distribution of core histones after DNA replication. Each nucleosome core particle is shown schematically to be composed of a central (H3H4)2 tetramer flanked by two H2A/H2B dimers. Histones that were present in parental nucleosomes prior to replication are indicated in blue; newly synthesized histones are indicated in red. According to this model, which is supported by a body of experimental evidence, the parental (H3H4)2 tetramers remain intact and are distributed randomly to both daughter duplexes. In contrast, the pairs of H2A/H2B dimers present in parental nucleosomes separate and recombine randomly with the (H3H4)2 tetramers on the daughter duplexes. Other models have been presented in which the parental (H3H4)2 tetramer is split in half by a histone chaperone, and the two resulting H3/H4 dimers are distributed to different DNA strands (discussed in Cell 140:183, 2010 and Nature Revs. Gen. 11:285, 2010.) (A: COURTESY OF STEVEN L. MCKNIGHT AND OSCAR L. MILLER, JR.)
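The random-distribution model of Figure 13.23b can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not a description of any published analysis: the function name `distribute_parental_tetramers` is hypothetical, each parental (H3H4)2 tetramer is assumed to go to either daughter duplex with equal probability, and the matching position on the other duplex is assumed to be filled by a newly synthesized tetramer so that nucleosome density is preserved.

```python
import random


def distribute_parental_tetramers(n_nucleosomes, seed=0):
    """Assign each intact parental (H3H4)2 tetramer to one daughter duplex.

    Returns two lists (one per daughter duplex). Each position holds "old"
    for a parental tetramer or "new" for a newly synthesized one.
    Assumption: a 50:50 choice at every position, independent of neighbors.
    """
    rng = random.Random(seed)
    daughter_a, daughter_b = [], []
    for _ in range(n_nucleosomes):
        if rng.random() < 0.5:
            daughter_a.append("old")
            daughter_b.append("new")
        else:
            daughter_a.append("new")
            daughter_b.append("old")
    return daughter_a, daughter_b


if __name__ == "__main__":
    a, b = distribute_parental_tetramers(1000)
    # Fraction of parental tetramers on each daughter duplex.
    print(a.count("old") / len(a), b.count("old") / len(b))
```

With many nucleosomes, each daughter duplex ends up with close to half old and half new tetramers, which is the "roughly equivalent mixture" described in the text.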


experiments have raised the possibility of another model, one in which the (H3H4)2 tetramer from parental nucleosomes is split into two H3/H4 dimers, each of which may combine with a newly synthesized H3/H4 dimer to form a “mixed” (H3H4)2 tetramer, which then assembles with H2A/H2B dimers. Regardless of the pattern by which it occurs, the stepwise assembly of nucleosomes and their orderly spacing along the DNA is facilitated by a network of accessory proteins. Included among these proteins are a number of histone chaperones that are able to accept either newly synthesized or parental histones and transfer them to the daughter strands. The best studied of these histone chaperones, CAF-1, is recruited to the advancing replication fork through an interaction with the sliding clamp PCNA. For the most part, daughter cells carry out the same pattern of transcription as their parental cells; this is one of the cornerstones underlying the homeostatic functioning of tissues and organs. As discussed in Chapter 12, the transcriptional state of a cell depends to a large degree upon the epigenetic state of the cell’s chromatin, which is inherited from one cell generation to the next. Epigenetic information is not encoded within a chromosome’s DNA sequence, but rather is encoded in the pattern of methylated cytosine residues in a cell’s DNA and in the pattern of posttranslational modifications of the core histones associated with the DNA. Consequently, it is essential that these patterns be faithfully transmitted from parental chromatin to the chromatin of daughter cells, yet very little is known about how such transmission occurs. As noted on page 531, DNA methylation patterns are apparently transmitted through the activities of the DNA methyltransferase DNMT1. Somehow, this enzyme appears capable of adding methyl groups to the cytosine residues of newly synthesized DNA strands using the pattern of such modifications on the parental DNA strands as a guide or template. 
Transmission of histone modifications is likely a much more complex challenge for the cell owing to the fact that there are quite a number of different types of such modifications as well as a number of different histone residues that can be modified. If “old” and “new” histones are transmitted randomly from parental to daughter chromatin as depicted in Figure 13.23, then the transmission of histone modifications is likely to occur by a very different mechanism than the transmission of DNA methylation patterns. It is likely, for example, that modifications present on old histones will guide the modification of new histones within neighboring nucleosomes on the same DNA strand. In the scheme that is simplest to envision, each particular type of mark, such as a trimethylated H3K9 residue or an acetylated H4K12 residue, is part of a positive feedback loop that causes that modification to be copied to the histones of an adjacent nucleosome. This type of mechanism is basically similar to the one depicted in Figure 12.21, which describes a scheme for the spreading of heterochromatin. According to this scenario, a particular histone modification (e.g., H3K9me) would serve as a specific binding site for a protein (e.g., HP1) that would in turn recruit an enzyme (e.g., SUV39H1) that would catalyze the same modification (i.e., H3K9me) on an adjacent histone molecule. Whether these ideas will hold up to experimental scrutiny remains to be seen.
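The positive feedback scheme just described can be sketched as a simple update rule. The code below is an illustrative toy model only: the function name `spread_mark` is hypothetical, nucleosomes are reduced to a row of true/false values for a single mark (such as H3K9me), and each marked nucleosome is assumed to cause the same mark to be added to both immediate neighbors in every round, with no boundary elements to stop the spread.

```python
def spread_mark(nucleosomes, rounds):
    """Copy a histone mark from marked nucleosomes to adjacent ones.

    nucleosomes: list of bools along one daughter strand
                 (True = carries the mark, e.g. inherited on an old histone).
    rounds:      number of read-and-write cycles to simulate.
    """
    state = list(nucleosomes)
    for _ in range(rounds):
        updated = list(state)
        for i, marked in enumerate(state):
            if marked:
                # The mark recruits a reader (e.g. HP1), which recruits a
                # writer (e.g. SUV39H1) that modifies the neighbors.
                if i > 0:
                    updated[i - 1] = True
                if i < len(state) - 1:
                    updated[i + 1] = True
        state = updated
    return state


if __name__ == "__main__":
    # Old (marked) and new (unmarked) histones intermixed after replication.
    after_replication = [True, False, False, True, False, False, False]
    for r in range(4):
        print(r, spread_mark(after_replication, r))
```

In this toy model, a strand that inherits even a few marked nucleosomes regains the full pattern within a few rounds, whereas a strand with no marked nucleosomes never acquires the mark.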

REVIEW
1. The original Watson-Crick proposal for DNA replication envisioned the continuous synthesis of DNA strands. How and why has this concept been modified over the intervening years?
2. What does it mean that replication is semiconservative? How was this feature of replication demonstrated in bacterial cells? In eukaryotic cells?
3. Why are there no heavy bands in the top three centrifuge tubes of Figure 13.3a?
4. How is it possible to obtain mutants whose defects lie in genes that are required for an essential activity such as DNA replication?
5. Describe the events that occur at an origin of replication during the initiation of replication in yeast cells. What is meant by replication being bidirectional?
6. Why do the DNA molecules depicted in Figure 13.7a fail to stimulate the polymerization of nucleotides by DNA polymerase I? What are the properties of a DNA molecule that allow it to serve as a template for nucleotide incorporation by DNA polymerase I?
7. Describe the mechanism of action of DNA polymerases operating on the two template strands and the effect this has on the synthesis of the lagging versus the leading strand.
8. Contrast the role of DNA polymerases I and III in bacterial replication.
9. Describe the role of the DNA helicase, the SSBs, the β clamp, the DNA gyrase, and the DNA ligase during replication in bacteria.
10. What is the consequence of having the DNA of the lagging-strand template looped back on itself as in Figure 13.13a?
11. How do the two exonuclease activities of DNA polymerase I differ from one another? What are their respective roles in replication?
12. Describe the factors that contribute to the high fidelity of DNA replication.
13. What is the major difference between bacteria and eukaryotes that allows a eukaryotic cell to replicate its DNA in a reasonable amount of time?

13.2 | DNA Repair

Life on Earth is subject to a relentless onslaught of destructive forces that originate in both the internal and external environments of an organism. Of all the molecules in a cell, DNA is placed in the most precarious position. On one hand, it is essential that the genetic information remain mostly unchanged as it is passed from cell to cell and individual to individual. On the other hand, DNA is one of the molecules in a cell that is most susceptible to environmental damage. When struck by ionizing radiation, the backbone of a DNA molecule is often broken; when exposed to a variety of reactive chemicals, many of which are produced by a cell’s own metabolism, the bases of a DNA molecule may be altered structurally; when subjected to ultraviolet radiation, adjacent pyrimidines on a DNA strand have a tendency to interact with one another to form a covalent complex, that is, a dimer (Figure 13.24). Even the absorption of thermal energy generated by metabolism is sufficient to split adenine and guanine bases from their attachment to the sugars of the DNA backbone. The magnitude of these spontaneous alterations, or lesions, can be appreciated from the estimate that each cell of a warm-blooded mammal loses approximately 10,000 bases per day! Failure to repair such lesions produces permanent alterations, or mutations, in the DNA. If the mutation occurs in a cell destined to become a gamete, the genetic alteration may be passed on to the next generation. Mutations also have effects in somatic cells (i.e., cells that are not in the germ line): they can interfere with transcription and replication, lead to the malignant transformation of a cell, or speed the process by which an organism ages.

Considering the potentially drastic consequences of alterations in DNA molecules and the high frequency at which they occur, it is essential that cells possess mechanisms for repairing DNA damage. In fact, cells have a bewildering arsenal of repair systems that correct virtually any type of damage to which a DNA molecule is vulnerable. It is estimated that less than one base change in a thousand escapes a cell’s repair systems. The existence of these systems provides an excellent example of the molecular mechanisms that maintain cellular homeostasis. The importance of DNA repair can be appreciated by examining the effects on humans that result from DNA repair deficiencies, a subject discussed in the Human Perspective on page 569.

Both prokaryotic and eukaryotic cells possess a variety of proteins that patrol vast stretches of DNA, searching for subtle chemical modifications or distortions of the DNA duplex. In some cases, damage can be repaired directly. Humans, for example, possess enzymes that can directly repair damage from cancer-producing alkylating agents. Most repair systems, however, require that a damaged section of the DNA be excised, that is, selectively removed. One of the great virtues of the DNA duplex is that each strand contains the information required for constructing its partner. Consequently, if one or more nucleotides is removed from one strand, the complementary strand can serve as a template for reconstruction of the duplex. The repair of DNA damage in eukaryotic cells is complicated by the relative inaccessibility of DNA within the folded chromatin fibers of the nucleus. As in the case of transcription, DNA repair involves the participation of chromatin-reshaping machines, such as the histone modifying enzymes and nucleosome remodeling complexes discussed on pages 526–529. Although important in DNA repair, the roles of these proteins will not be considered in the following discussion.

Figure 13.24 A pyrimidine dimer that has formed within a DNA duplex following UV irradiation.

Nucleotide Excision Repair

Nucleotide excision repair (NER) operates by a cut-and-patch mechanism that removes a variety of bulky lesions, including pyrimidine dimers and nucleotides to which various chemical groups have become attached. Two distinct NER pathways can be distinguished:
1. A transcription-coupled pathway in which the template strands of genes that are being actively transcribed are preferentially repaired. Repair of a template strand is thought to occur as the DNA is being transcribed, and the presence of the lesion may be signaled by a stalled RNA polymerase. This preferential repair pathway ensures that those genes of greatest importance to the cell, which are the genes the cell is actively transcribing, receive the highest priority on the “repair list.”
2. A slower, less efficient global genomic pathway that corrects DNA strands in the remainder of the genome.

Although recognition of the lesion is probably accomplished by different proteins in the two NER pathways (step 1, Figure 13.25), the steps that occur during repair of the lesion are thought to be very similar, as indicated in steps 2–6 of Figure 13.25. One of the key components of the NER machinery is TFIIH, a huge protein that also participates in the initiation of transcription. The discovery of the involvement of TFIIH established a crucial link between transcription and DNA repair, two processes that were previously assumed to be independent of one another (discussed in the Experimental Pathways, which can be accessed on the Web at www.wiley.com/college/karp). Included among the various subunits of TFIIH are two subunits (XPB and XPD) that possess helicase activity; these enzymes separate the two strands of the duplex (step 2, Figure 13.25) in preparation for removal of the lesion. The damaged strand is then cut on both sides of the lesion by a pair of endonucleases (step 3), and the segment of DNA between the incisions is released (step 4). Once the segment has been excised, the gap is filled by a DNA polymerase (step 5), and the strand is sealed by DNA ligase (step 6).
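The cut-and-patch logic of nucleotide excision repair (incision on both sides of the lesion, excision, gap filling from the undamaged strand, ligation) can be sketched with strings. This is a simplified illustration, not a model of the actual enzymology: the function names and the `flank` parameter are hypothetical, the lesion is written as `x`, and the two strands are written aligned position by position so that the base at each index of the template pairs with the base at the same index of the damaged strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the position-by-position complementary strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def nucleotide_excision_repair(damaged, template, lesion_start, lesion_end, flank=2):
    """Repair a bulky lesion occupying damaged[lesion_start:lesion_end].

    flank is an assumed number of nucleotides removed on each side of the
    lesion; the real incision distances are not modeled here.
    """
    # Step 3: incisions on the 5' and 3' sides of the lesion.
    cut_5 = max(0, lesion_start - flank)
    cut_3 = min(len(damaged), lesion_end + flank)
    # Step 4: excision of the segment between the two incisions.
    upstream, downstream = damaged[:cut_5], damaged[cut_3:]
    # Step 5: repair synthesis, copying the undamaged strand as template.
    patch = complement(template[cut_5:cut_3])
    # Step 6: ligation of the patch to the flanking DNA.
    return upstream + patch + downstream


if __name__ == "__main__":
    original = "GACCTTGGACTA"
    template = complement(original)      # the undamaged partner strand
    damaged = "GACCxxGGACTA"             # thymine dimer written as "xx"
    print(nucleotide_excision_repair(damaged, template, 4, 6))
```

Because the patch is copied from the complementary strand, the sequence is restored even though several undamaged nucleotides on either side of the lesion are discarded along with it.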


Figure 13.25 Nucleotide excision repair. The following steps are depicted in the drawing and discussed in the text: (1) damage recognition in the global pathway is mediated by an XPC-containing protein complex, whereas damage recognition in the transcription-coupled pathway is thought to be mediated by a stalled RNA polymerase in conjunction with a CSB protein; (2) DNA strand separation (by XPB and XPD proteins, two helicase subunits of TFIIH); (3) incision (by XPG on the 3′ side and the XPF–ERCC1 complex on the 5′ side); (4) excision; (5) DNA repair synthesis (by DNA polymerase δ and/or ε); and (6) ligation (by DNA ligase I).

Figure 13.26 Base excision repair. The steps are described in the text. Other pathways for BER are known, and BER also has been shown to have distinct transcription-coupled and global repair pathways.

Base Excision Repair

A separate excision repair system operates to remove altered nucleotides generated by reactive chemicals present in the diet or produced by metabolism. The steps in this repair pathway in eukaryotes, which is called base excision repair (BER), are shown in Figure 13.26. BER is initiated by a DNA glycosylase that recognizes the alteration (step 1, Figure 13.26) and removes the altered base by cleavage of the glycosidic bond holding the base to the deoxyribose sugar (step 2). A number of different DNA glycosylases have been identified, each more-or-less specific for a particular type of altered base, including uracil (formed by the hydrolytic removal of the amino group of cytosine), 8-oxoguanine (caused by damage from oxygen free radicals, page 35), and 3-methyladenine (produced by transfer of a methyl group from a methyl donor, page 437).

Structural studies of the DNA glycosylase that removes the highly mutagenic 8-oxoguanine (oxoG) indicate that this enzyme diffuses rapidly along the DNA “inspecting” each of the G-C base pairs within the DNA duplex (Figure 13.27, step 1). In step 2, the enzyme has come across an oxoG-C base pair. When this occurs, the enzyme inserts a specific amino acid side chain into the DNA helix, causing the nucleotide to rotate (“flip”) 180 degrees out of the DNA helix and into the body of the enzyme (step 2). If the nucleotide does, in fact, contain an oxoG, the base fits into the active site of the enzyme (step 3) and is cleaved from its associated sugar. In contrast, if the extruded nucleotide contains a normal guanine, which only differs in structure by two atoms from oxoG, it is unable to fit into the enzyme’s active site (step 4) and it is returned to its appropriate position within the stack of bases.

Once an altered purine or pyrimidine is removed by a glycosylase, the “beheaded” deoxyribose phosphate remaining in the site is excised by the combined action of a specialized (AP) endonuclease and a DNA polymerase. AP endonuclease cleaves the DNA backbone (Figure 13.26, step 3) and a phosphodiesterase activity of polymerase β removes the sugar–phosphate remnant that had been attached to the excised base (step 4). Polymerase β then fills the gap by inserting a nucleotide complementary to the undamaged strand (step 5), and the strand is sealed by DNA ligase III (step 6).
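The steps of base excision repair can be followed on a short sequence. The sketch below is illustrative only: the function name `base_excision_repair` is hypothetical, uracil (`U`) stands in for any altered base recognized by a glycosylase, and the two strands are written aligned position by position so that each base pairs with the base at the same index of the undamaged strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_excision_repair(strand, undamaged, altered="U"):
    """Replace each altered base using the undamaged strand as template."""
    repaired = []
    for base, partner in zip(strand, undamaged):
        if base == altered:
            # Steps 1-2: a DNA glycosylase recognizes and removes the base.
            # Steps 3-4: AP endonuclease cuts the backbone and polymerase
            #            beta removes the sugar-phosphate remnant (a gap).
            # Step 5:    polymerase beta inserts the complementary nucleotide.
            # Step 6:    DNA ligase III seals the strand.
            repaired.append(COMPLEMENT[partner])
        else:
            repaired.append(base)
    return "".join(repaired)


if __name__ == "__main__":
    # A cytosine opposite G has been deaminated to uracil at the third position.
    print(base_excision_repair("GCUAC", "CGGTG"))
```

Only a single nucleotide is replaced in this pathway, in contrast to the longer patch removed during nucleotide excision repair.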
The fact that cytosine can be converted to uracil may explain why natural selection favored the use of thymine, rather than uracil, as a base in DNA, even though uracil was presumably present in RNA when it served as genetic material during the early evolution of life (page 454). If uracil had been retained as a DNA base, it would have caused difficulty for repair systems to distinguish between a uracil that “belonged” at a particular site and one that resulted from an alteration of cytosine.

Mismatch Repair

It was noted earlier that cells can remove mismatched bases that are incorporated by the DNA polymerase and escape the enzyme’s proofreading exonuclease. This process is called mismatch repair (MMR). A mismatched base pair causes a distortion in the geometry of the double helix that can be recognized by a repair enzyme. But how does the enzyme “recognize” which member of the mismatched pair is the incorrect nucleotide? If it were to remove one of the nucleotides at random, it would make the wrong choice 50 percent of the time, creating a permanent mutation at that site. Thus, for a mismatch to be repaired after the DNA polymerase has moved past a site, it is important that the repair system distinguish the newly synthesized strand, which contains the incorrect nucleotide, from the parental strand, which contains the correct nucleotide. In E. coli, the two strands are distinguished by the presence of methylated adenosine residues on the parental strand. DNA methylation does not appear to be utilized by the MMR system in eukaryotes, and the mechanism of identification of the newly synthesized strand remains unclear. Several different MMR pathways have been identified but will not be discussed here.
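The 50 percent figure can be checked with a small simulation that compares a repair system choosing a strand at random with one that always corrects the newly synthesized strand. This is a minimal sketch; the function names are hypothetical, and the example mismatch (a parental G paired with a misincorporated T) is chosen only for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_at_random(parental_base, new_base, rng):
    """Correct one member of the mismatched pair, chosen at random."""
    if rng.random() < 0.5:
        # The new strand is corrected: the original pair is restored.
        return parental_base, COMPLEMENT[parental_base]
    # The parental strand is "corrected": the error becomes a mutation.
    return COMPLEMENT[new_base], new_base


def repair_strand_directed(parental_base, new_base):
    """Always correct the newly synthesized strand."""
    return parental_base, COMPLEMENT[parental_base]


def mutation_fraction(trials, seed=0):
    """Fraction of random repairs that change the parental base."""
    rng = random.Random(seed)
    mutated = 0
    for _ in range(trials):
        parental, _new = repair_at_random("G", "T", rng)
        if parental != "G":
            mutated += 1
    return mutated / trials


if __name__ == "__main__":
    print(mutation_fraction(10_000))
    print(repair_strand_directed("G", "T"))
```

Random repair converts about half of the mismatches into permanent A-T mutations, whereas strand-directed repair always restores the original G-C pair, which is why a means of identifying the new strand is essential.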

Figure 13.27 Detecting damaged bases during BER. In step 1, a DNA glycosylase (named hOGG1) is inspecting a nucleotide that is paired to a cytosine. In step 2, the nucleotide is flipped out of the DNA duplex. In this case, the base is an oxidized version of guanine, 8-oxoguanine, and it is able to fit into the active site of the enzyme (step 3) where it is cleaved from its attached sugar. The subsequent steps in BER were shown in Figure 13.26. In step 4, the extruded base is a normal guanine, which is unable to fit into the active site of the glycosylase and is returned to the base stack. Failure to remove oxoG would have resulted in a G-to-T mutation. (BASED ON S. S. DAVID, WITH PERMISSION FROM NATURE 434:569, 2005. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/ TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Double-Strand Breakage Repair

X-rays, gamma rays, and particles released by radioactive atoms are all described as ionizing radiation because they generate ions as they pass through matter. Millions of gamma rays pass through our bodies every minute. When these forms of radiation collide with a fragile DNA molecule, they often break both strands of the double helix. Double-strand breaks (DSBs) can also be caused by certain chemicals, including several (e.g., bleomycin) used in cancer chemotherapy, and free radicals produced by normal cellular metabolism (page 35). DSBs are also introduced during replication of damaged DNA. A single double-strand break can cause serious chromosome abnormalities, which can have grave consequences for the cell. DSBs can be repaired by several alternate pathways. The predominant pathway in mammalian cells is called nonhomologous end joining (NHEJ), in which a complex of proteins binds to the broken ends of the DNA duplex and catalyzes a series of reactions that

rejoin the broken strands. The major steps that occur during NHEJ are shown in Figure 13.28a and described in the accompanying legend. Figure 13.28b shows the nucleus of a human fibroblast that had been treated with a laser to induce a localized cluster of double-strand breaks and then stained for the presence of the protein Ku at various times after laser treatment. This NHEJ repair protein is seen to localize at the site of the DSBs immediately following their appearance.

Another DSB repair pathway known as homologous recombination (HR) requires a homologous chromosome to serve as a template for repair of the broken strand. The steps that occur during homologous recombination, which include excision of the damaged DNA, are similar to those of genetic recombination depicted in Figure 14.47. A comparison between these two DSB repair pathways shows major differences. Homologous recombination is a more accurate pathway; that is, it leaves fewer errors in the base sequence of the repaired DNA than does NHEJ. However, because it requires that a homologous chromosome be present in the nucleus, HR can only be employed during the cell cycle after DNA replication takes place (i.e., during late S or G2 phase). Defects in both repair pathways have been linked to increased cancer susceptibility.

REVIEW
1. Contrast the events of nucleotide excision repair and base excision repair.
2. Why is it important in mismatch repair that the cell distinguish the parental strands from the newly synthesized strands?

Figure 13.28 Repairing double-strand breaks (DSBs) by nonhomologous end joining. (a) In this simplified model of double-strand break repair, the lesion (step 1) is detected by a heterodimeric, ring-shaped protein called Ku that binds to the broken ends of the DNA (step 2). The DNA-bound Ku recruits another protein, called DNA-PKcs, which is the catalytic subunit of a DNA-dependent protein kinase (step 3). Most of the substrates phosphorylated by this protein kinase have not been identified. These proteins bring the ends of the broken DNA together in such a way that they can be joined by DNA ligase IV to regenerate an intact DNA duplex (step 4). The NHEJ pathway may also involve the activities of nucleases and polymerases (not shown) and is more error prone than is the homologous recombination pathway of DSB repair. (b) Time course analysis of Ku localization at sites of DSB formation induced by laser microbeam irradiation at a site indicated by the arrowheads. The NHEJ protein Ku becomes localized at the damage site immediately following irradiation but remains there just briefly as the damage is presumably repaired. Micrographs were taken (1) immediately, (2) 2 hours, and (3) 8 hours after irradiation. (B: FROM JONG-SOO KIM ET AL., J. CELL BIOL. 170:344, 2005, FIG. 3, REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

13.3 | Between Replication and Repair

The Human Perspective describes an inherited disease, xeroderma pigmentosum (XP), that leaves patients with an inability to repair certain lesions caused by exposure to ultraviolet radiation. Patients described as having the “classical” form of XP have a defect in one of seven different genes involved in nucleotide excision repair (page 569). These genes are designated XPA, XPB, XPC, XPD, XPE, XPF, and XPG, and some of their roles in NER are indicated in the legend of Figure 13.25. Another group of patients were identified that, like those with XP, were highly susceptible to developing skin cancer as the result of sun exposure. However, unlike the cells from XP sufferers, cells from these patients were capable of nucleotide excision repair and were only slightly more sensitive to UV light than normal cells. This heightened UV sensitivity revealed itself during replication, as these cells often produced fragmented daughter strands following UV irradiation. Patients in this group were classified as having a variant form of XP, designated XP-V. We will return to the basis of the XP-V defect in a moment.

We have seen in the previous section that cells can repair a great variety of DNA lesions. On occasion, however, a DNA lesion is not repaired by the time that segment of DNA is scheduled to undergo replication. On these occasions, the replication machinery arrives at the site of damage on the template strand and becomes stalled there. When this happens, some type of signal is emitted that leads to the recruitment of a specialized polymerase that is able to bypass the lesion.⁴ Suppose the lesion in question is a thymidine dimer (Figure 13.24) in a skin cell that was caused by exposure to UV radiation. When the replicative polymerase (pol δ or ε) reaches the obstacle, the enzyme is temporarily replaced by a “specialized” DNA polymerase designated pol η, which is able to insert two A residues into the newly synthesized strand across from the two T residues that are covalently linked as part of the dimer. Once this “damage bypass” is accomplished, the cell switches back to the normal replicative polymerase and DNA synthesis continues without leaving any trace that a serious problem had been resolved. As you might have guessed from the juxtaposition of topics, patients afflicted with XP-V have a mutation in the gene encoding pol η and thus have difficulty replicating past thymidine dimers.

⁴ A cell has other options to deal with a stalled replication fork, but they are more complex and poorly understood, and will not be discussed.

Discovered in 1999, polymerase η is a member of a family of DNA polymerases in which each member is specialized for incorporating nucleotides opposite particular types of DNA lesions in the template strand. The polymerases of this family are said to engage in translesion synthesis (TLS). X-ray crystallographic studies reveal that the TLS polymerases have an unusually spacious active site that is able to physically accommodate altered nucleotides that would not fit in the active site of a replicative polymerase. These TLS polymerases are only capable of incorporating one to a few nucleotides into a DNA strand (they lack processivity); they have no proofreading capability; and they are much more likely to incorporate an incorrect (i.e., noncomplementary) nucleotide when copying undamaged DNA than the classic polymerases.

THE HUMAN PERSPECTIVE
The Consequences of DNA Repair Deficiencies

We owe our lives to light from the sun, which provides the energy captured during photosynthesis. But the sun also emits a constant stream of ultraviolet rays that ages and mutates the cells of our skin. The hazardous effects of the sun are most dramatically illustrated by the rare recessive genetic disorder, xeroderma pigmentosum (XP). Patients with XP possess a deficient nucleotide excision repair system that cannot remove segments of DNA damaged by ultraviolet radiation. As a result, persons with XP are extremely sensitive to sunlight; even very limited exposure to the direct rays of the sun can produce large numbers of dark-pigmented spots on exposed areas of the body and a greatly elevated risk of developing disfiguring and fatal skin cancers.

XP is not the only genetic disorder characterized by nucleotide excision repair deficiency. Cockayne syndrome (CS) is an inherited disorder characterized by acute sensitivity to light, neurological dysfunction due to demyelination of neurons, and dwarfism, but no evident increase in the frequency of skin cancer. Cells from persons with CS are deficient in the pathway by which transcriptionally active DNA is repaired (page 565). The remainder of the genome is repaired at the normal rate, presumably accounting for the normal levels of skin cancer. But why are persons with a defective repair mechanism subject to specific abnormalities such as dwarfism? Most cases of CS can be traced to a mutation in one of two genes, either CSA or CSB, which are thought to be involved in coupling transcription to DNA repair (see Figure 13.25). Mutations in these genes, in addition to impacting DNA repair, may also disturb the transcription of certain genes, leading to growth retardation and abnormal development of the nervous system. This possibility is strengthened by the finding that, in rare cases, the symptoms of CS can also occur in persons with XP who carry specific mutations in the XPD gene. As noted on page 565, XPD encodes a subunit of the transcription factor TFIIH required for transcription initiation. Mutations in XPD could lead to defects in both DNA repair and transcription. Certain other mutations in the XPD gene are responsible for another disease, trichothiodystrophy (TTD), which also combines symptoms suggestive of both DNA repair and transcription defects. Like CS patients, individuals with TTD exhibit increased sun sensitivity without the increased risk of development of cancer. TTD patients have additional symptoms, including brittle hair and scaly skin. These findings indicate that three distinct disorders (XP, CS, and TTD) can be caused by defects in a single gene, with the particular disease outcome likely determined by the specific mutation present in that gene. Structural studies of mutant XPD molecules suggest that these different mutations affect different functions of the protein.

Elsewhere in this text, we have described circumstances that lead to premature (or accelerated) aging in humans or animal models: as the result of (1) increased free radicals (page 35), (2) increased mitochondrial DNA mutations (page 208), and (3) mutations in a protein of the nuclear envelope (page 490). In 2006, a 15-year-old boy who suffered from frequent sunburns and certain characteristics of premature aging came to the attention of clinical researchers. Genetic analysis determined that the boy carried a mutation in the XPF gene whose encoded protein makes one of the cuts during the NER pathway (Figure 13.25). Patients with mild mutations in XPF develop XP and have impaired NER. This individual had a more severe mutation in the XPF gene, causing his cells to be unable to repair covalent cross-links that form occasionally between the two strands of a DNA duplex. Studies on the cells of this individual, and on mice with a corresponding mutation, suggested that the unrepaired cross-links lead to increased cell death (apoptosis), which either directly or indirectly promotes premature aging. According to one hypothesis, defects in DNA repair systems that result primarily in an increased mutation rate in the body’s cells are associated with an increased susceptibility to cancer, whereas defects in DNA repair systems that result primarily in cell death are associated with accelerated aging.ᵃ Whether any of these premature-aging syndromes provides insight into the mechanisms of normal aging remains a matter of debate.

Persons with DNA-repair disorders are not the only individuals who should worry about exposure to the sun. Even in a skin cell whose repair enzymes are functioning at optimal levels, a small fraction of the lesions fail to be excised and replaced. Alterations in DNA lead to mutations that can cause a cell to become malignant. Thus, one of the consequences of the failure to correct UV-induced damage is the risk of skin cancer. Consider the following statistics: more than one million persons develop one of three forms of skin cancer every year in the United States, and most of these cases are attributed to overexposure to the sun’s ultraviolet rays. Fortunately, the two most common forms of skin cancer, basal cell carcinoma and squamous cell carcinoma, rarely spread to other parts of the body and can usually be excised in a doctor’s office. Both of these types of cancer originate from the skin’s epithelial cells. However, malignant melanoma, the third type of skin cancer, is a potential killer. Unlike the others, melanomas develop from pigment cells (melanocytes) in the skin. The number of cases of melanoma diagnosed in the United States is climbing at the alarming rate of 4 percent per year due to the increasing amount of time people have spent in the sun over the past few decades. Studies suggest that one of the greatest risk factors to developing melanoma as an adult is the occurrence of a severe, blistering sunburn as a child or adolescent. Individuals at greatest risk are Caucasians with extremely light skin. Many of these individuals have pigment cells whose surfaces lack a functioning receptor (called MC1R) for a hormone that is secreted by nearby epithelial cells of the skin in response to ultraviolet radiation. Melanocytes respond to MC1R activation by producing the dark pigment melanin, thereby providing the individual with a tan. Tanned skin is more protected from UV rays than is light, untanned skin, even though it is UV radiation that is responsible for triggering the tanning response. What if it were possible to develop a tanned skin without having to suffer UV exposure? A number of research groups are working on such an approach by using various means other than exposure to UV-containing sunlight to stimulate the tanning response in pigment cells. Whether any of these approaches will prove safe and effective remains to be seen.

Skin cancer is not the only disease that is promoted by deficient or overworked DNA repair systems. It is estimated that up to 15 percent of colon cancer cases can be attributed to mutations in the genes that encode the proteins required for mismatch repair. Mutations that cripple the mismatch repair system inevitably lead to a higher mutation rate in other genes because mistakes made during replication are not corrected. Cancer is also one of the consequences of double-strand DNA breaks that have either gone unrepaired or been repaired incorrectly. Breaks in DNA can be caused by a variety of environmental agents to which we are commonly exposed, including X-rays, gamma rays, and radioactive emissions. The most serious environmental hazard in this regard is probably radon (specifically ²²²Rn), a radioactive isotope formed during the disintegration of uranium. Some areas of the planet contain relatively high levels of uranium in the soil, and houses built in these regions can contain dangerous levels of radon gas. When the gas is breathed into the lungs, it can lead to double-strand DNA breaks that increase the risk of lung cancer. A significant fraction of lung cancer deaths in nonsmokers is probably due to radon exposure.

ᵃ It has not been mentioned in this discussion, for a number of reasons, that two of the genes most often responsible for premature aging syndromes encode members of a particular type of DNA helicase family called RecQ helicases. The genes in question are WRN and BLM which, when mutated, are responsible for the inherited diseases Werner syndrome and Bloom syndrome, respectively, which are characterized by both increased cancer risk and features of accelerated aging. It is suggested that these helicases are involved in certain types of base excision and DSB repair pathways. They appear to be particularly important in resolving situations where a replicative DNA polymerase becomes stalled at a lesion and the replication fork “collapses” (disassembles).

Chapter 13 DNA Replication and Repair

| Synopsis

DNA replication occurs semiconservatively, which indicates that one intact strand of the parent duplex is transmitted to each of the daughter cells during cell division. This mechanism of replication was first suggested by Watson and Crick as part of their model of DNA structure. They suggested that replication occurred by gradual separation of the strands by means of hydrogen bond breakage, so that each strand could serve as a template for the synthesis of a complementary strand. This model was soon confirmed in both bacterial and eukaryotic cells by showing that cells transferred to labeled media for one generation produce daughter cells whose DNA has one labeled strand and one unlabeled strand. (p. 546)

The mechanism of replication is best understood in bacterial cells. Replication begins at a single origin on the circular bacterial chromosome and proceeds outward in both directions as a pair of replication forks. Replication forks are sites where the double helix is unwound and nucleotides are incorporated into both newly synthesized strands. (p. 549)

DNA synthesis is catalyzed by a family of DNA polymerases. The first of these enzymes to be characterized was DNA polymerase I of E. coli. To catalyze the polymerization reaction, the enzyme requires all four deoxyribonucleoside triphosphates, a template strand to copy, and a primer containing a free 3′ OH to which nucleotides can be added. The primer is required because the enzyme cannot initiate the synthesis of a DNA strand. Rather, it can only add nucleotides to the 3′ hydroxyl terminus of an existing strand. Another unexpected characteristic of DNA polymerase I is that it only polymerizes a strand in a 5′ → 3′ direction. It had been presumed that the two new strands would be synthesized in opposite directions by polymerases moving in opposite directions along the two parental template strands. This finding was explained when it was shown that the two strands were synthesized quite differently. (p. 550)

One of the newly synthesized strands (the leading strand) grows toward the replication fork and is synthesized continuously. The other newly synthesized strand (the lagging strand) grows away from the fork and is synthesized discontinuously. In bacterial cells, the lagging strand is synthesized as fragments approximately 1000 nucleotides long, called Okazaki fragments, that are covalently joined to one another by a DNA ligase. In contrast, the leading strand is synthesized as a single continuous strand. Neither the continuous strand nor any of the Okazaki fragments can be initiated by the DNA polymerase but instead begin as a short RNA primer that is synthesized by a type of RNA polymerase called primase. After the RNA primer is assembled, DNA polymerase continues to synthesize the strand or fragment as DNA. The RNA is subsequently degraded, and the gap is filled in as DNA. (p. 552)

Events at the bacterial replication fork require a variety of different types of proteins having specialized functions. These include a DNA gyrase, which is a type II topoisomerase required to relieve the tension that builds up ahead of the fork as a result of DNA unwinding; a DNA helicase that unwinds the DNA by separating the strands; single-stranded DNA-binding proteins that bind selectively to single-stranded DNA and prevent reassociation; a primase that synthesizes the RNA primers; and a DNA ligase that seals the fragments of the lagging strand into a continuous polynucleotide. DNA polymerase III is the primary DNA-synthesizing enzyme that adds nucleotides to each RNA primer, whereas DNA polymerase I is responsible for removing the RNA primers and replacing them with DNA. Two molecules of DNA polymerase III are thought to move together as a complex along their respective template strands. This is accomplished as the lagging-strand template loops back on itself. (p. 553)

DNA polymerases possess separate catalytic sites for polymerization and degradation of nucleic acid strands. Most DNA polymerases possess both 5′ → 3′ and 3′ → 5′ exonuclease activities. The first acts to degrade the RNA primers that begin each Okazaki fragment, and the second removes inappropriate nucleotides following their mistaken incorporation, thus contributing to the fidelity of replication. It is estimated that approximately one in 10⁹ nucleotides is incorporated incorrectly during replication in E. coli. (p. 556)

Replication in eukaryotic cells follows a similar mechanism and employs similar proteins to those of prokaryotes. All of the DNA polymerases involved in replication elongate DNA strands in the 5′ → 3′ direction. None of them initiates the synthesis of a chain without a primer. Most possess a 3′ → 5′ exonuclease activity, ensuring that replication occurs with high fidelity. Unlike bacteria, replication in eukaryotes is initiated simultaneously at many sites along a chromosome, with replication forks proceeding outward in both directions from each site of initiation. Studies on yeast indicate that origins of replication contain a specific binding site for an essential multiprotein complex called ORC. Events at the origin ensure that replication of each DNA segment occurs once and only once per cell cycle. (p. 558)

Replication in eukaryotic cells is intimately associated with nuclear structures. In eukaryotes, replication forks that are active at any given time are localized within about 50 to 250 sites called replication foci. Newly synthesized DNA is rapidly associated with nucleosomes. According to one model, (H3H4)₂ tetramers present prior to replication remain intact and are passed on to the daughter duplexes, whereas H2A/H2B dimers separate from one another and bind randomly to new and old (H3H4)₂ tetramers on the daughter duplexes. (p. 562)

DNA is subject to damage by many environmental influences, including ionizing radiation, common chemicals, and ultraviolet radiation. Cells possess a variety of systems to recognize and repair the resulting damage. It is estimated that less than one base change in a thousand escapes a cell’s repair systems. Four major types of DNA repair systems are discussed. Nucleotide excision repair (NER) systems operate by removing a small section of a DNA strand containing a bulky lesion, such as a pyrimidine dimer. During NER, the strands of DNA containing the lesion are separated by a helicase; paired incisions are made by endonucleases; the gap is filled by a DNA polymerase; and the strand is sealed by a DNA ligase. The template strands of genes that are actively transcribed are preferentially repaired by NER. Base excision repair removes a variety of altered nucleotides that produce minor distortions in the DNA helix. Cells possess a variety of glycosylases that recognize and remove various types of altered bases. Once the base is removed, the remaining portion of the nucleotide is removed by an endonuclease, the gap is enlarged by a phosphodiesterase activity, and the gap is filled and sealed by a polymerase and ligase. Mismatch repair is responsible for removing incorrect nucleotides incorporated during replication that escape the proofreading activity of the polymerase. In bacteria, the newly synthesized strand is selected for repair by virtue of its lack of methyl groups compared to the parental strand. Double-strand breaks are repaired as proteins bind to the broken strands and join the ends together. An alternate pathway of DSB repair, called homologous recombination, is not discussed. (p. 564)

In addition to the classic DNA polymerases involved in DNA replication and repair, cells also possess an array of DNA polymerases that facilitate replication at sites of DNA lesions or misalignments. These polymerases, which engage in translesion synthesis, lack processivity and proofreading capability and are more error-prone than classic polymerases. (p. 568)

| Analytic Questions

1. Suppose that Meselson and Stahl had carried out their experiment by growing cells in medium with 14N and then transferring the cells to medium containing 15N. How would the bands within the centrifuge tubes have appeared if replication were semiconservative? If replication were conservative? If replication were dispersive?
2. Suppose you isolated a mutant strain of yeast that replicated its DNA more than once per cell cycle. In other words, each gene in the genome was replicated several times between successive cell divisions. How might you explain such a phenomenon?
3. How would the chromosomes from the experiment on eukaryotic cells depicted in Figure 13.4 have appeared if replication occurred by a conservative or a dispersive mechanism?
4. We have seen that cells possess a special enzyme to remove uracil from DNA. What do you suppose would happen if the uracil groups were not removed? (You might consider the information presented in Figure 11.44 on the pairing properties of uracil.)
5. Draw a partially double-stranded DNA molecule that would not serve as a template for DNA synthesis by DNA polymerase I.
6. Some temperature-sensitive bacterial mutants stop replication immediately following elevation of temperature, whereas others continue to replicate their DNA for a period of time before they cease this activity, and still others continue until a round of replication is completed. How might these three types of mutants differ?
7. Suppose the error rate during replication in human cells were the same as that of bacteria (about 10⁻⁹). How would this impact the two cells differently?
8. Figure 13.19 shows the results from an experiment in which cells were incubated with [³H]thymidine for less than 30 minutes prior to fixation. How would you expect this photograph to appear after a one-hour labeling period? Can you conclude that the entire genome is replicated within an hour? If not, why not?
9. Origins of replication tend to have a region that is very rich in A-T base pairs. What function do you suppose these sections might serve?
10. What are the advantages of replication occurring in a small number of replication foci rather than in the general nucleoplasm?
11. What are some of the reasons you might expect human cells to have more efficient repair systems than those of a frog?
12. Suppose you were to compare autoradiographs of two cells that had been exposed to [³H]thymidine, one that was engaged in DNA replication (S phase) and another that was not. How would you expect autoradiographs of these cells to differ?
13. Construct a model that would explain how transcriptionally active DNA is repaired preferentially over transcriptionally silent DNA.

14 Cellular Reproduction

14.1 The Cell Cycle
14.2 M Phase: Mitosis and Cytokinesis
14.3 Meiosis
THE HUMAN PERSPECTIVE: Meiotic Nondisjunction and Its Consequences
EXPERIMENTAL PATHWAYS: The Discovery and Characterization of MPF

According to the third tenet of the cell theory, new cells originate only from other living cells. The process by which this occurs is called cell division. For a multicellular organism, such as a human or an oak tree, countless divisions of a single-celled zygote produce an organism of astonishing cellular complexity and organization. Cell division does not stop with the formation of the mature organism but continues in certain tissues throughout life. Millions of cells residing within the marrow of your bones or the lining of your intestinal tract are undergoing division at this very moment. This enormous output of cells is needed to replace cells that have aged or died. Although cell division occurs in all organisms, it takes place very differently in prokaryotes and eukaryotes. We will restrict discussion to the eukaryotic version. Two distinct types of eukaryotic cell division will be discussed in this chapter. Mitosis leads to production of cells that are genetically identical to their parent, whereas meiosis leads to production of cells with half the genetic content of the parent. Mitosis serves as the basis for producing new cells, meiosis as the basis for producing new sexually reproducing organisms. Together, these two types of cell division form the links in the chain between parents and their offspring and, in a broader sense, between living species and the earliest eukaryotic life forms present on Earth.

Fluorescence micrograph of a mitotic spindle that had assembled in a cell-free extract prepared from frog eggs, which are cells that lack a centrosome. The red spheres consist of chromatin-covered beads that were added to the extract. It is evident from this micrograph that a bipolar spindle can assemble in the absence of both chromosomes and centrosomes. In this experiment, the chromatin-covered beads served as nucleating sites for the assembly of the microtubules that subsequently formed this spindle. The mechanism by which cells construct mitotic spindles in the absence of centrosomes is discussed on page 587. (FROM R. HEALD, ET AL., NATURE VOL. 382, COVER OF 8/1/96; © 1996, REPRINTED WITH PERMISSION FROM MACMILLAN PUBLISHERS, LIMITED.)

14.1 | The Cell Cycle

thesized DNA. If [3H]thymidine is given to a culture of cells for a short period (e.g., 30 minutes) and a sample of the cell population is fixed, dried onto a slide, and examined by autoradiography, only a fraction of the cells are found to have radioactive nuclei. Among cells that were engaged in mitosis at the time of fixation (as evidenced by their compacted chromosomes) none is found to have a radioactively labeled nucleus. These mitotic cells have unlabeled chromosomes because they were not engaged in DNA replication during the labeling period. If labeling is allowed to continue for one or two hours before the cells are sampled, there are still no cells with labeled mitotic chromosomes (Figure 14.2). We can conclude from these results that there is a definite period of time between the end of DNA synthesis and the beginning of M phase. This period is termed G2 (for second gap). The duration of G2 is revealed as one continues to take samples of cells from the culture until labeled mitotic chromosomes are observed. The first cells whose mitotic chromosomes are labeled must have been at the last stages of DNA synthesis at the start of the incubation with [3H]thymidine. The length of time between the start of the labeling period and the appearance of cells with labeled mitotic figures corresponds to the duration of G2. DNA replication occurs during a period of the cell cycle termed S phase. S phase is also the period when the cell synthesizes the additional histones that will be needed as the cell doubles the number of nucleosomes in its chromosomes (see Figure 13.23). The length of S phase can be determined directly. In an asynchronous culture, the percentage of cells engaged in a particular activity is an approximate measure of the percentage of time that this activity occupies in the lives of cells. Thus, if we know the length of the entire cell cycle, the

In a population of dividing cells, whether inside the body or in a culture dish, each cell passes through a series of defined stages, which constitutes the cell cycle (Figure 14.1). The cell cycle can be divided into two major phases based on cellular activities readily visible with a light microscope: M phase and interphase. M phase includes (1) the process of mitosis, during which duplicated chromosomes are separated into two nuclei, and (2) cytokinesis, during which the entire cell divides into two daughter cells. Interphase, the period between cell divisions, is a time when the cell grows and engages in diverse metabolic activities. Whereas M phase usually lasts only an hour or so in mammalian cells, interphase may extend for days, weeks, or longer, depending on the cell type and the conditions. Although M phase is the period when the contents of a cell are actually divided, numerous preparations for an upcoming mitosis occur during interphase, including replication of the cell’s DNA. One might guess that a cell engages in replication throughout interphase. However, studies in the early 1950s on asynchronous cultures (i.e., cultures whose cells are randomly distributed throughout the cell cycle) showed that this is not the case. As described in Chapter 13, DNA replication can be monitored by the incorporation of [3H]thymidine into newly synInterphase

[Figure 14.1 diagram. Labels: Interphase, comprising G1 (cell grows and carries out normal metabolism; organelles duplicate), S (DNA replication and chromosome duplication), and G2 (cell grows and prepares for mitosis); M phase, comprising mitosis (prophase, prometaphase, metaphase, anaphase, telophase) and cytokinesis.]

Figure 14.1 An overview of the eukaryotic cell cycle. This diagram of the cell cycle indicates the stages through which a cell passes from one division to the next. The cell cycle is divided into two major phases: M phase and interphase. M phase includes the successive events of mitosis and cytokinesis. Interphase is divided into G1, S, and G2 phases, with S phase being equivalent to the period of DNA synthesis. The division of interphase into three separate phases based on the timing of DNA synthesis was first proposed in 1953 by Alma Howard and Stephen Pelc of Hammersmith Hospital, London, based on their experiments on plant meristem cells.

14.1 The Cell Cycle

[Figure 14.2 plot. Vertical axis: percentage of labeled mitoses (20 to 100). Horizontal axis: hours after [3H]thymidine addition (5 to 30).]

Figure 14.2 Experimental results demonstrating that replication occurs during a defined period of the cell cycle. HeLa cells were cultured for 30 minutes in medium containing [3H]thymidine and then incubated (chased) for various times in unlabeled medium before being fixed and prepared for autoradiography. Each culture dish was scanned for cells that were in mitosis at the time they were fixed, and the percentage of those mitotic cells whose chromosomes were labeled was plotted as shown. (FROM A STUDY BY R. BASERGA AND F. WIEBEL.)

length of S phase can be calculated directly from the percentage of the cells whose nuclei are radioactively labeled during a brief pulse with [3H]thymidine. Similarly, the length of M phase can be calculated from the percentage of cells in the population that are seen to be engaged in mitosis or cytokinesis. When one adds up the periods of G2 + S + M, it is apparent that there is an additional period in the cell cycle yet to be accounted for. This other phase, termed G1 (for first gap), is the period following mitosis and preceding DNA synthesis.
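The arithmetic behind these estimates can be sketched in a few lines of Python. This is a minimal illustration, not a protocol from the text: the function name `estimate_phase_lengths`, the 24-hour cycle, the labeled and mitotic fractions, and the time course of labeled mitoses are all hypothetical values chosen to make the calculation easy to follow.

```python
def estimate_phase_lengths(cycle_hours, labeled_fraction, mitotic_fraction,
                           labeled_mitoses):
    """Estimate phase durations (in hours) for an asynchronous culture.

    labeled_mitoses is a list of (hours_since_label_added, percent_labeled_mitoses)
    pairs sorted by time. All inputs here are illustrative assumptions.
    """
    # G2: time from the start of labeling until labeled mitotic figures first appear.
    g2 = next(hours for hours, percent in labeled_mitoses if percent > 0)
    # S and M: the fraction of cells engaged in an activity approximates the
    # fraction of the cycle that the activity occupies.
    s = labeled_fraction * cycle_hours
    m = mitotic_fraction * cycle_hours
    # G1: the part of the cycle not accounted for by G2 + S + M.
    g1 = cycle_hours - (g2 + s + m)
    return {"G1": g1, "S": s, "G2": g2, "M": m}


# Hypothetical observations for a culture with a 24-hour cell cycle.
time_course = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 12.0), (5.0, 55.0), (6.0, 90.0)]
phases = estimate_phase_lengths(
    cycle_hours=24.0,
    labeled_fraction=0.375,   # 37.5% of nuclei labeled after a brief pulse
    mitotic_fraction=0.0625,  # 6.25% of cells seen in mitosis or cytokinesis
    labeled_mitoses=time_course,
)
print(phases)  # {'G1': 9.5, 'S': 9.0, 'G2': 4.0, 'M': 1.5}
```

Because the fraction-of-cells rule is only approximate, the resulting durations should be read as rough estimates rather than precise measurements.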

Cell Cycles in Vivo

One of the properties that distinguishes various types of cells within a multicellular plant or animal is their capacity to grow and divide. We can recognize three broad categories of cells:

Chapter 14 Cellular Reproduction

1. Cells, such as nerve cells, muscle cells, or red blood cells, that are highly specialized and lack the ability to divide. Once these cells have differentiated, they remain in that state until they die.
2. Cells that normally do not divide but can be induced to begin DNA synthesis and divide when given an appropriate stimulus. Included in this group are liver cells, which can be induced to proliferate by the surgical removal of part of the liver, and lymphocytes, which can be induced to proliferate by interaction with an appropriate antigen.
3. Cells that normally possess a relatively high level of mitotic activity. Included in this category are stem cells of various adult tissues, such as hematopoietic stem cells that give rise to red and white blood cells (Figure 17.6) and stem cells at the base of numerous epithelia that line the body cavities and the body surface (Figure 7.1). The relatively unspecialized cells of apical meristems located near the tips of plant roots and stems also exhibit rapid and continual cell division.

Stem cells have an important property that is not shared by most cells; they are able to divide asymmetrically.

An asymmetric cell division is one in which the two daughter cells have different sizes, components, or fates. The asymmetric division of a stem cell produces one daughter cell that remains an uncommitted stem cell like its parent and another daughter cell that has taken a step towards becoming a differentiated cell of that tissue. In other words, asymmetric divisions allow stem cells to engage in both self-renewal and the formation of differentiated tissue cells. Some types of nonstem cells can also engage in asymmetric (or unequal) cell divisions, as illustrated by the formation of oocytes and polar bodies in Figure 14.41b and the division of the T cell in the Chapter 17 opening photo.

Cell cycles can range in length from as short as 30 minutes in a cleaving frog embryo, whose cell cycles lack both G1 and G2 phases, to several months in slowly growing tissues, such as the mammalian liver. Many cells in the body are said to be quiescent, which means that they are in a state that will not lead them to an upcoming cell division, but they retain the capability to divide if conditions should change. With a few notable exceptions, cells that have stopped dividing are arrested in a stage preceding the initiation of DNA synthesis. Quiescent cells are often described as being in the G0 state to distinguish them from the typical G1-phase cells that may soon enter S phase. A cell must receive a growth-promoting signal to proceed from G0 into G1 phase and thus reenter the cell cycle.

Control of the Cell Cycle

The study of the cell cycle is not only important in basic cell biology, but also has enormous practical implications in combating cancer, a disease that results from a breakdown in a cell’s ability to regulate its own division. In 1970, a series of cell fusion experiments carried out by Potu Rao and Robert Johnson of the University of Colorado helped open the door to understanding how the cell cycle is regulated. Rao and Johnson wanted to know whether the cytoplasm of cells contains regulatory factors that affect cell cycle activities. They approached the question by fusing mammalian cells that were in different stages of the cell cycle. In one experiment, they fused mitotic cells with cells in other stages of the cell cycle. The mitotic cell always induced compaction of the chromatin in the nucleus of the nonmitotic cell (Figure 14.3). If a G1-phase and an M-phase cell were fused, the chromatin of the G1-phase nucleus underwent premature chromosomal compaction to form a set of elongated compacted chromosomes (Figure 14.3a). If a G2-phase and M-phase cell were fused, the G2 chromosomes also underwent premature chromosome compaction, but unlike those of a G1 nucleus, the compacted G2 chromosomes were visibly doubled, reflecting the fact that replication had already occurred (Figure 14.3c). If a mitotic cell was fused with an S-phase cell, the S-phase chromatin also became compacted (Figure 14.3b). However, replicating DNA is especially sensitive to damage, so that compaction in the S-phase nucleus led to the formation of “pulverized” chromosomal fragments rather than intact, compacted chromosomes. The results of these experiments suggested that the cytoplasm of a mitotic cell contained diffusible factors that could induce mitosis in a nonmitotic (i.e., interphase) cell. This finding suggested that the transition from G2 to M was under positive control; that is, the transition was induced by the presence of some stimulatory agent.

[Figure 14.3 micrographs. Panel labels: (a) mitotic chromosomes and G1 chromosomes; (b) mitotic chromosomes and pulverized chromosome fragments; (c) mitotic chromosomes and G2 chromosomes.]

Figure 14.3 Experimental demonstration that cells contain factors that stimulate entry into mitosis. The photographs show the results of the fusion of an M-phase HeLa cell with a rat kangaroo PtK2 cell that had been in (a) G1 phase, (b) S phase, or (c) G2 phase at the time of cell fusion. As described in the text, the chromatin of the G1-phase and G2-phase PtK2 cells undergoes premature compaction, whereas that of the S-phase cell becomes pulverized. The elongated chromatids of the G2-phase cell in c are doubled in comparison with those of the G1 cell in a. (FROM KARL SPERLING AND POTU N. RAO, HUMANGENETIK 23:437, 1974. WITH KIND PERMISSION OF SPRINGER SCIENCE+BUSINESS MEDIA.)

The Role of Protein Kinases

While the cell-fusion experiments revealed the existence of factors that regulated the cell cycle, they provided no information about the biochemical properties of these factors. Insights into the nature of the agents that promote entry of a cell into mitosis (or meiosis) were first gained in a series of experiments on the oocytes and early embryos of frogs and invertebrates. These experiments are described in the Experimental Pathways at the end of this chapter. To summarize here, it was shown that entry of a cell into M phase is initiated by a protein called maturation-promoting factor (MPF). MPF consists of two subunits: (1) a subunit with kinase activity that transfers phosphate groups from ATP to specific serine and threonine residues of specific protein substrates and (2) a regulatory subunit called cyclin. The term cyclin was coined because the concentration of this regulatory protein rises and falls in a predictable pattern with each cell cycle (Figure 14.4). When the cyclin concentration is low, the kinase lacks the cyclin subunit and, as a result, is inactive. When the cyclin concentration rises, the kinase is activated, causing the cell to enter M phase. These results suggested that (1) progression of cells into mitosis depends on an enzyme whose sole activity is to phosphorylate other proteins, and (2) the activity of this enzyme is controlled by a subunit whose concentration varies from one stage of the cell cycle to another.

[Figure 14.4 drawing. Tracings, top to bottom: alternating periods of mitosis and interphase; MPF activity (high to low); cyclin concentration (high to low).]

Figure 14.4 Fluctuation of cyclin and MPF levels during the cell cycle. This drawing depicts the cyclical changes that occur during early frog development when mitotic divisions occur rapidly and synchronously in all cells of the embryo. The top tracing shows the alternation between periods of mitosis and interphase, the middle tracing shows the cyclical changes in MPF kinase activity, and the lower tracing shows the cyclical changes in the concentrations of cyclins that control the relative activity of the MPF kinase. (FROM A. W. MURRAY AND M. W. KIRSCHNER, SCIENCE 246:616, 1989; COPYRIGHT 1989, AAAS. SCIENCE BY MOSES KING, REPRODUCED WITH PERMISSION OF AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Over the past two decades, a large number of laboratories have focused on MPF-like enzymes, called cyclin-dependent kinases (Cdks). It has been found that Cdks are not only involved in M phase but are the key agents that orchestrate activities throughout the cell cycle. Cdks carry out this function by phosphorylating a diverse array of proteins. Each phosphorylation event occurs at an appropriate point during the cell cycle, thereby stimulating or inhibiting a particular cellular process involved in cell division.

Yeast cells have been particularly useful in studies of the cell cycle, at least in part because of the availability of temperature-sensitive mutants whose abnormal proteins affect various cell cycle processes. As discussed on page 549, temperature-sensitive mutants can be grown in a relatively normal manner at a lower (permissive) temperature and then shifted to a higher (restrictive) temperature to study the effect of the mutant gene product. Researchers studying the genetic control of the cell cycle have focused on two distantly related yeast species, the budding yeast Saccharomyces cerevisiae, which reproduces through the formation of buds at one end of the cell (see Figure 1.18b), and the fission yeast, Schizosaccharomyces pombe, which reproduces by elongating itself and then splitting into two equal-sized cells (see Figure 14.6). The molecular basis of cell cycle regulation has been remarkably conserved throughout the evolution of eukaryotes. Once a gene involved in cell cycle control has been identified in one of the two yeast species, homologues are sought, and usually found, in the genomes of higher eukaryotes, including humans. By combining genetic, biochemical, and live-cell analyses, investigators have gained a comprehensive understanding of the major activities that allow a cell to grow and reproduce in a laboratory culture dish. Research into the genetic control of the cell cycle in yeast began in the 1970s in two laboratories, initially that of Leland Hartwell at the University of Washington working on budding yeast and subsequently that of Paul Nurse at the University of Oxford working on fission yeast.
Both laboratories identified a gene that, when mutated, would cause the growth of cells at elevated temperature to stop at certain points in the cell cycle. The product of this gene, which was called cdc2 in fission yeast (and CDC28 in budding yeast), was eventually found to be homologous to the catalytic subunit of MPF; in other words, it was a cyclin-dependent kinase. Subsequent research on yeast as well as many different vertebrate cells has supported the concept that the progression of a eukaryotic cell through its cell cycle is regulated at distinct stages. One of the primary stages of regulation occurs near the end of G1 and another near the end of G2. These stages represent points in the cell cycle where a cell becomes committed to beginning a crucial event, either initiating replication or entering mitosis.

We will begin our discussion with fission yeast, which has the least complex cell cycle. In this species, the same Cdk (cdc2) is responsible for passage through both points of commitment, though in partnership with different cyclins. A simplified representation of cell cycle regulation in fission yeast is shown in Figure 14.5. The first transition point, which is called START, occurs in late G1. Once a cell has passed START, it is irrevocably committed to replicating its DNA and, ultimately, completing the cell cycle.1 Passage through START requires the activation of cdc2 by one or more G1/S cyclins, whose levels rise during late G1 (Figure 14.5). Passage from G2 to mitosis requires activation of cdc2 by a different group of cyclins, the mitotic cyclins. Cdks containing a mitotic cyclin (e.g., MPF described on page 611) phosphorylate substrates that are required for the cell to enter mitosis. Included among the substrates are proteins required for the dynamic changes in organization of both the chromosomes and cytoskeleton that characterize the shift from interphase to mitosis. Cells make a third commitment during the middle of mitosis, which determines whether they will complete cell division and reenter G1 of the next cycle. Exit from mitosis and entry into G1 depends on a rapid decrease in Cdk activity that results from a plunge in concentration of the mitotic cyclins (Figure 14.5), an event that will be discussed on page 592 in conjunction with other mitotic activities.

1 Mammalian cells pass through a comparable point during G1, referred to as the restriction point, at which time they become committed to DNA replication and ultimately to completing mitosis. Prior to the restriction point, mammalian cells require the presence of growth factors in their culture medium if they are to progress in the cell cycle. After they have passed the restriction point, these same cells will continue through the remainder of the cell cycle without external stimulation.

[Figure 14.5 diagram. Labels: cdc2 kinase; G1/S cyclins; mitotic cyclins; START; cell cycle phases G1, S, G2, and M.]

Figure 14.5 A simplified model for cell cycle regulation in fission yeast. The cell cycle is controlled primarily at two points, START and the G2–M transition. Passage of a cell through these two critical junctures (black arrows) requires the activation of the same cdc2 kinase by different classes of cyclins, either G1/S or mitotic cyclins. A third major transition occurs at the end of mitosis and is triggered by a rapid drop in concentration of mitotic cyclins. (Note: cdc2 is also known as Cdk1.)

Cyclin-dependent kinases are often described as the “engines” that drive the cell cycle through its various stages. The activities of these enzymes are regulated by a variety of “brakes” and “accelerators” that operate in combination with one another. These include:

Cyclin Binding

As indicated in Figure 14.5, the levels of particular cyclins rise over time. When a cyclin reaches a sufficient concentration in the cell, it binds to the catalytic subunit of a Cdk, causing a major change in the conformation of the enzyme’s active site. X-ray crystallographic structures of various cyclin–Cdk complexes indicate that cyclin binding causes

the movement of a flexible loop of the Cdk polypeptide chain away from the opening of the active site, allowing the Cdk to phosphorylate its protein substrates.

Cdk Phosphorylation/Dephosphorylation

We have already seen in other chapters that many events that take place in a cell are regulated by the addition and removal of phosphate groups from proteins. The same is true for the events that lead to the onset of mitosis. We can see from Figure 14.5 that the level of mitotic cyclins rises through S and G2. The mitotic cyclins present in a yeast cell during this period bind to the Cdk to form a cyclin–Cdk complex, but the complex shows little evidence of kinase activity. Then, late in G2, the cyclin–Cdk becomes activated and mitosis is triggered. To understand this change in Cdk activity, we have to look at the activity of three other regulatory enzymes: two kinases and a phosphatase. We will look briefly at the events that occur in fission yeast (Figure 14.6a). The roles of these enzymes in the fission yeast cycle, which is illustrated in Figure 14.6b, were revealed through a combination of genetic and biochemical analyses. In step 1, one of the kinases, called CAK (Cdk-activating kinase), phosphorylates a critical threonine residue (Thr 161 of cdc2 in Figure 14.6b). Phosphorylation of this residue is necessary, but not sufficient, for the Cdk to be active. A second protein kinase shown in step 1, called Wee1, phosphorylates a key tyrosine residue in the ATP-binding pocket of the enzyme (Tyr 15 of cdc2 in Figure 14.6b). If this residue is phosphorylated, the enzyme is inactive, regardless of the phosphorylation state of any other residue. In other words, the effect of Wee1 overrides the effect of CAK, keeping the Cdk in an inactive state. Line 2 of Figure 14.6c shows the phenotype of cells with a mutant wee1 gene. These mutants cannot maintain the Cdk in an inactive state and divide at an early stage in the cell cycle, producing smaller cells, hence the name “wee.” In normal (wild-type) cells, Wee1 keeps the Cdk inactive until the end of G2. Then, at the end of G2, the inhibitory phosphate at Tyr 15 is removed by the third enzyme, a phosphatase named Cdc25 (step 2, Figure 14.6b). Removal of this phosphate switches the stored cyclin–Cdk molecules into the active state, allowing them to phosphorylate key substrates and drive the yeast cell into mitosis. Line 3 of Figure 14.6c shows the phenotype of cells with a mutant cdc25 gene. These mutants cannot remove the inhibitory phosphate from the Cdk and cannot enter mitosis. The balance between Wee1 kinase and Cdc25 phosphatase activities, which normally determines whether the cell will remain in G2 or progress into mitosis, is regulated by still other kinases and phosphatases. As we will see shortly, these pathways can stop the cell from entering mitosis under conditions that might lead to an abnormal cell division.

Cdk Inhibitors

Cdk activity can be blocked by a variety of inhibitors. In budding yeast, for example, a protein called Sic1 acts as a Cdk inhibitor during G1. The degradation of Sic1 allows the cyclin–Cdk that is present in the cell to initiate DNA replication. The role of Cdk inhibitors in mammalian cells is discussed on page 581.
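The activation rules described above (cyclin binding, the activating phosphate at Thr 161, the overriding inhibitory phosphate at Tyr 15, and Cdk inhibitors) can be summarized as a small Boolean model. The sketch below is only an illustration of that logic: the class `Cdc2State` and the functions `is_active` and `apply_cdc25` are hypothetical names, and treating each control as a simple on/off flag is a deliberate simplification of the underlying biochemistry.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cdc2State:
    """On/off description of one cdc2 (Cdk) molecule; a simplifying assumption."""
    cyclin_bound: bool
    thr161_phosphorylated: bool  # activating phosphate added by CAK
    tyr15_phosphorylated: bool   # inhibitory phosphate added by Wee1
    inhibitor_bound: bool = False  # e.g., a Cdk inhibitor such as Sic1


def is_active(state: Cdc2State) -> bool:
    # Cyclin binding and Thr 161 phosphorylation are both necessary, but the
    # Tyr 15 phosphate (or a bound inhibitor) overrides them.
    return (
        state.cyclin_bound
        and state.thr161_phosphorylated
        and not state.tyr15_phosphorylated
        and not state.inhibitor_bound
    )


def apply_cdc25(state: Cdc2State) -> Cdc2State:
    # Cdc25 phosphatase removes the inhibitory phosphate from Tyr 15.
    return replace(state, tyr15_phosphorylated=False)


# Stored cyclin-Cdk complex during G2: both phosphates present, so inactive.
g2_complex = Cdc2State(cyclin_bound=True, thr161_phosphorylated=True,
                       tyr15_phosphorylated=True)
print(is_active(g2_complex))               # False
print(is_active(apply_cdc25(g2_complex)))  # True
```

In this picture, a wee1 mutant corresponds to a complex that never acquires the Tyr 15 phosphate and therefore becomes active as soon as cyclin is bound and Thr 161 is phosphorylated, whereas a cdc25 mutant corresponds to a complex on which `apply_cdc25` is never carried out.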

[Figure 14.6 illustration. (a) Micrograph labels: G2 fission yeast cell; post-mitotic fission yeast cells. (b) Diagram labels: steps 1 to 3 through interphase (G2), mitosis, and interphase (G1); cdc2 kinase; cyclin; CAK; Wee1; Cdc25; Tyr15–P; Thr161–P; inactive; active; degradation. (c) Rows: 1, wild type; 2, wee1–; 3, cdc25–; stages G2 and M.]

Figure 14.6 Progression through the fission yeast cell cycle requires the phosphorylation and dephosphorylation of critical cdc2 residues. (a) Colorized scanning electron micrograph of wild type fission yeast cells. (b) During G2, the cdc2 kinase interacts with a mitotic cyclin but remains inactive as the result of phosphorylation of a key tyrosine residue (Tyr 15 in fission yeast) by Wee1 (step 1). A separate kinase, called CAK, transfers a phosphate to another residue (Thr 161), which is required for cdc2 kinase activity later in the cell cycle. When the cell reaches a critical size, an enzyme called Cdc25 phosphatase is activated, which removes the inhibitory phosphate on the Tyr 15 residue. The resulting activation of the cdc2 kinase drives the cell into mitosis (step 2). By the end of mitosis (step 3), the stimulatory phosphate group is removed from Thr 161 by another phosphatase. The free cyclin is subsequently degraded, and the cell begins another cycle. (The mitotic Cdk in mammalian cells is phosphorylated and dephosphorylated in a similar manner.) (c) Identification of Wee1 kinase and Cdc25 phosphatase was made by studying mutants that behaved as shown in this figure. Line 1 shows the G2 and M stages of a wild-type cell. Line 2 shows the effect of a mutant wee1 gene; the cell divides prematurely, forming small (wee) cells. Line 3 shows the effect of a mutant cdc25 gene; the cell does not divide but continues to grow. The red arrow marks the time when the temperature is raised to inactivate the mutant protein. (A: STEVE GSCHMEISSNER/PHOTO RESEARCHERS, INC. B: AFTER T. R. COLEMAN AND W. G. DUNPHY, CURR. OPIN. CELL BIOL. 76:877, 1994. CURRENT OPINION IN CELL BIOLOGY BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Controlled Proteolysis

It is evident from Figures 14.4 and 14.5 that cyclin concentrations oscillate during each cell cycle, which leads to changes in the activity of Cdks. Cells regulate the concentration of cyclins, and other key cell cycle proteins, by adjusting both the rate of synthesis and the rate of destruction of the molecule at different points in the cell cycle. Degradation is accomplished by means of the ubiquitin–proteasome pathway described on page 541. Unlike other mechanisms that control Cdk activity, degradation is an irreversible event that helps drive the cell cycle in a single direction. Regulation of the cell cycle requires two classes of multisubunit complexes (SCF and APC complexes) that function as ubiquitin ligases. These complexes recognize proteins to be degraded and link these proteins to a polyubiquitin chain, which ensures their destruction in a proteasome. The SCF complex is active from late G1 through early mitosis (see Figure 14.26a) and mediates the destruction of G1/S cyclins, Cdk inhibitors, and other cell cycle proteins. These proteins become targets for an SCF after they are phosphorylated by the protein kinases (i.e., Cdks) that regulate the cell cycle. Mutations that inhibit SCFs from mediating proteolysis of key proteins, such as G1/S cyclins or the Cdk inhibitor Sic1 mentioned above, can prevent cells from entering S phase and replicating their DNA. The APC complex acts in mitosis and degrades a number of key mitotic proteins, including the mitotic cyclins. Destruction of the mitotic cyclins allows a cell to exit mitosis and enter a new cell cycle (page 592).
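The way steady synthesis combined with abrupt, irreversible destruction produces the sawtooth pattern of Figure 14.4 can be illustrated with a toy simulation. This is a sketch under strong assumptions, not a quantitative model: the function `simulate_mitotic_cyclin`, the arbitrary concentration units, the constant synthesis rate, the fixed threshold, and the complete one-step destruction of cyclin are all hypothetical simplifications.

```python
def simulate_mitotic_cyclin(steps, synthesis_per_step=1, mitotic_threshold=5):
    """Return a list of (cyclin_level, phase) pairs, one per time step.

    Assumptions (not from the text): cyclin accumulates at a constant rate,
    the Cdk becomes active once cyclin reaches a fixed threshold, and
    APC-mediated proteolysis then removes all of the cyclin in a single step.
    """
    level = 0
    trace = []
    for _ in range(steps):
        if level >= mitotic_threshold:
            # Enough cyclin to activate the Cdk: the cell is in M phase.
            trace.append((level, "M"))
            # Irreversible destruction of cyclin lets the cell exit mitosis.
            level = 0
        else:
            trace.append((level, "interphase"))
            # Cyclin keeps accumulating during interphase.
            level += synthesis_per_step
    return trace


trace = simulate_mitotic_cyclin(steps=12)
print([level for level, _ in trace])
# [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]
print([i for i, (_, phase) in enumerate(trace) if phase == "M"])
# [5, 11]
```

Because destruction resets the cyclin level rather than merely slowing its rise, the simulated cycle can only move forward, which mirrors the point that proteolysis helps drive the cell cycle in a single direction.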


Subcellular Localization

Cells contain a number of different compartments in which regulatory molecules can either be united with or separated from the proteins they interact with. Subcellular localization is a dynamic phenomenon in which cell cycle regulators are moved into different compartments at different stages. For example, one of the major mitotic cyclins in animal cells (cyclin B1) shuttles between the nucleus and cytoplasm until G2, when it accumulates in the nucleus just prior to the onset of mitosis (Figure 14.7). According to one proposal, nuclear accumulation of cyclin B1 is facilitated by phosphorylation of one or more serine residues that reside in its nuclear export signal (NES, page 492). In this model, phosphorylation blocks subsequent export of the cyclin back to the cytoplasm. According to an alternative proposal, cyclin B1–Cdk1 stimulates its own translocation into the nucleus by phosphorylating and activating components of the nuclear import machinery. Regardless of the mechanism, if nuclear accumulation of the cyclin is blocked, cells fail to initiate cell division.

Figure 14.7 Experimental demonstration of subcellular localization during the cell cycle. Micrographs of a living HeLa cell that has been injected with cyclin B1 linked to the green fluorescent protein (page 273). The cell shown in a is in the G2 phase of its cell cycle, and the fluorescently labeled cyclin B1 is localized almost entirely in the cytoplasm. The micrograph in b shows the cell in prophase of mitosis, and the labeled cyclin B1 is concentrated in the cell nucleus. The basis for this change in localization is discussed in the text. (FROM PAUL CLUTE AND JONATHAN PINES, NATURE CELL BIOL. 1:83, 1999. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

As noted above, the proteins and processes that control the cell cycle are remarkably conserved among eukaryotes. As in yeast, successive waves of synthesis and degradation of different cyclins play a key role in driving mammalian cells from one stage to the next. Unlike yeast cells, which have a single Cdk, mammalian cells produce several different versions of this protein kinase. Different cyclin–Cdk complexes target different groups of substrates at different points within the cell cycle. The pairing between individual cyclins and Cdks is specific, and only certain combinations are found (Figure 14.8a). In mammalian cells, for example, the activity of a cyclin E–Cdk2 complex drives the cell into S phase, whereas activity of a cyclin B1–Cdk1 complex (the mammalian MPF) drives the cell into mitosis. Cdks do not always stimulate activities, but can also inhibit inappropriate events. For example, cyclin B1–Cdk1 activity during G2 prevents a cell from rereplicating DNA that has already been replicated earlier in the cell cycle (page 560). This helps ensure that each region of the genome is replicated once and only once per cell cycle.

The roles of the various cyclin–Cdk complexes shown in Figure 14.8a have been determined by a wide range of biochemical studies carried out on mammalian cells for more than two decades. Over the past few years, the roles of these proteins have been reexamined in knockout mice, with some surprising results (Figure 14.8b). As expected, the phenotype of a particular knockout mouse depends on the gene that has been eliminated. Mice that are unable to synthesize Cdk1, cyclin B1, cyclins E1 and E2, or cyclin A2 die as early embryos, suggesting that the proteins encoded by these genes are essential for a normal cell cycle. In contrast, a mouse embryo that lacks the genes encoding all of the other cell cycle Cdks (namely, Cdks 2, 4, and 6) is capable of developing to a stage with fully formed organs, although the animal does not survive to birth (Figure 14.8b). Cells taken from such embryos are capable of proliferating in culture, though more slowly than normal cells. This finding indicates that, as in yeast, Cdk1 is the only Cdk required to drive a mammalian cell through all of the stages of the cell cycle. In other words, even though the other Cdks are normally expressed at specific times during the mammalian cell cycle, Cdk1 is able to “cover” for their absence, ensuring that all of the required substrates are phosphorylated at each stage of the cell cycle. This is a classical case of redundancy, in which a protein is able to carry out functions that it would not normally perform.
Still, the absence of one of these “nonessential” cyclins or Cdks typically results in distinct cell cycle abnormalities, at least in certain types of cells. Mice lacking a gene for cyclin D1, for example, are smaller than control animals, which stems from a reduction in the level of cell division throughout the body. In addition, cyclin D1-deficient animals display a particular lack of cell proliferation during development of the retina. Mice lacking Cdk4 develop without insulin-producing cells in their pancreas. Mice lacking Cdk2 appear to develop normally but exhibit specific defects during meiosis (Figure 14.8b), which reinforces the important differences in the regulation of mitotic and meiotic divisions.

[Figure 14.8 diagrams. (a) Cell cycle labels: G1; S; cyclin D's + Cdk4/Cdk6; cyclin E + Cdk2; cyclin A + Cdk2; cyclin B/A + Cdk1. (b) Genotypes combining Cdk1+/+ with deletions of Cdk2, Cdk4, and/or Cdk6; developmental stages E1.5, E12.5, E16.5, P0, and adult; outcome labels: no cell division (two-cell embryo); decreased haematopoietic precursors and cardiomyocytes; decreased cardiomyocytes; male and female sterility.]

Figure 14.8 Cyclin-Cdks in the mammalian cell cycle. (a) Combinations between various cyclins and cyclin-dependent kinases at different stages in the mammalian cell cycle. Cdk activity during early G1 is very low, which promotes the formation of prereplication complexes at the origins of replication (see Figure 13.20). By mid-G1, Cdk activity is evident due to the association of Cdk4 and Cdk6 with the D-type cyclins (D1, D2, and D3). Among the substrates for these Cdks is an important regulatory protein called pRb (Section 16.3, Figure 16.12). The phosphorylation of pRb leads to the transcription of a number of genes, including those that code for cyclins E and A, Cdk1, and proteins involved in replication. The G1–S transition, which includes the initiation of replication, is driven by the activity of the cyclin E–Cdk2 and cyclin A–Cdk2 complexes. The transition from G2 to M and passage through early M is driven by the sequential activity of cyclin A–Cdk1 and cyclin B1–Cdk1 complexes, which phosphorylate such diverse substrates as cytoskeletal proteins, histones, and proteins of the nuclear envelope. (The mammalian Cdk1 kinase is equivalent to the fission yeast cdc2 kinase, and its inhibition and activation are similar to that indicated in Figure 14.6.) (b) The effects on mouse development of the deletion of genes (shown in red) encoding various Cdks. Of the four primary mammalian Cdks, only Cdk1 is absolutely required for cell division. Embryos that express only Cdk1 die during the course of embryonic development. Mice expressing both Cdk1 and Cdk4 develop into adults that are sterile, owing to defects in the meiotic cell cycles. E, embryonic day number; P, postnatal day number. (A: C. G. SHERR, CELL 73:1060, 1993; CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER. H. A. COLLER, NATURE REVS. MOL. CELL BIOL. 8:667, 2007. NATURE REVIEWS MOLECULAR CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER. B: MALUMBRES AND BARBACID, NAT REVS CANCER 9, 160, 2009 FIGURE 2. NATURE REVIEWS CANCER BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

2 Other symptoms of the disease include unsteady posture (ataxia) resulting from degeneration of nerve cells in the cerebellum, permanently dilated blood vessels (telangiectasia) in the face and elsewhere, susceptibility to infection, and cells with an abnormally high number of chromosome aberrations. The basis for the first two symptoms has yet to be determined.
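The cyclin–Cdk pairings listed in the legend of Figure 14.8 can be written out as a small lookup table, which makes the knockout result easier to see. The sketch below is only a bookkeeping aid under stated assumptions: the names `CYCLIN_CDK_COMPLEXES`, `complexes_remaining`, and `can_divide` are hypothetical, the stage labels are shorthand, and the rule in `can_divide` simply restates the observation that only Cdk1 is absolutely required for cell division.

```python
# Pairings as described in the Figure 14.8 legend; stage names are shorthand.
CYCLIN_CDK_COMPLEXES = {
    "mid-G1": [("cyclin D", "Cdk4"), ("cyclin D", "Cdk6")],
    "G1-S": [("cyclin E", "Cdk2"), ("cyclin A", "Cdk2")],
    "G2-M": [("cyclin A", "Cdk1"), ("cyclin B1", "Cdk1")],
}


def complexes_remaining(knocked_out_cdks):
    """List, per stage, the usual complexes whose Cdk has not been deleted."""
    return {
        stage: [(cyclin, cdk) for cyclin, cdk in pairs if cdk not in knocked_out_cdks]
        for stage, pairs in CYCLIN_CDK_COMPLEXES.items()
    }


def can_divide(knocked_out_cdks):
    # Restates the knockout finding: Cdk1 alone can cover for Cdks 2, 4, and 6.
    return "Cdk1" not in knocked_out_cdks


triple_knockout = {"Cdk2", "Cdk4", "Cdk6"}
print(complexes_remaining(triple_knockout))
print(can_divide(triple_knockout))  # True
print(can_divide({"Cdk1"}))         # False
```

For the triple knockout, the table shows that none of the usual G1 or G1–S complexes can form, which is exactly the situation in which Cdk1 must pair with other cyclins to carry the cell through every stage.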

Checkpoints, Cdk Inhibitors, and Cellular Responses

Ataxia-telangiectasia (AT) is an inherited recessive disorder characterized by a host of diverse symptoms, including a greatly increased risk for certain types of cancer (see footnote 2). During the late 1960s, following the deaths of several individuals undergoing radiation therapy, it was discovered that patients with AT are extremely sensitive to ionizing radiation (page 567). So too are cells from these patients, which lack a crucial protective response found in normal cells. When normal cells are subjected to treatments that damage DNA, such as ionizing radiation or DNA-altering drugs, their progress through the cell cycle stops while the damage is repaired. If, for example, a normal cell is irradiated during the G1 phase of the cell cycle, it delays progression into S phase. Similarly, cells irradiated in S phase delay further DNA synthesis, whereas cells irradiated in G2 delay entry into mitosis. Studies of this type carried out in yeast gave rise to a concept, formulated by Leland Hartwell and Ted Weinert in 1988, that cells possess checkpoints as part of their cell cycle.

Checkpoints are surveillance mechanisms that halt the progress of the cell cycle if (1) any of the chromosomal DNA is damaged, or (2) certain critical processes, such as DNA replication during S phase or chromosome alignment during M phase, have not been properly completed. Checkpoints ensure that each of the various events that make up the cell cycle occurs accurately and in the proper order. Many of the proteins of the checkpoint machinery have no role in normal cell cycle events and are only called into action when an abnormality appears. In fact, the genes encoding several checkpoint proteins were first identified in mutant yeast cells that continued their progress through the cell cycle, despite suffering DNA damage or other abnormalities that caused serious defects.

Checkpoints are activated throughout the cell cycle by a system of sensors that recognize DNA damage or cellular abnormalities. If a sensor detects the presence of a defect, it triggers a response that temporarily arrests further cell cycle


progress. The cell can then use the delay to repair the damage or correct the defect rather than continuing to the next stage. This is especially important because mammalian cells that undergo division with genetic damage run the risk of becoming transformed into cancer cells. If the DNA is damaged beyond repair, the checkpoint mechanism can transmit a signal that leads either to (1) the death of the cell or (2) its conversion to a state of permanent cell cycle arrest (known as senescence).

We have seen in numerous places in this text how the study of a rare human disease has led to a discovery of basic importance in cell and molecular biology. The cell’s DNA damage response provides another example of this path to discovery. The gene responsible for ataxia-telangiectasia (the ATM gene) encodes a protein kinase that is activated by certain DNA lesions, particularly double-stranded breaks (page 567). Remarkably, the presence of a single break in one of the cell’s DNA molecules is sufficient to cause rapid, large-scale activation of ATM molecules, causing cell cycle arrest. A related protein kinase called ATR is also activated by DNA breaks as well as other types of lesions, including those resulting from incompletely replicated DNA or UV irradiation. Both ATM and ATR are part of multiprotein complexes capable of binding to chromatin that contains damaged DNA. Once bound, ATM and ATR can phosphorylate a remarkable variety of proteins that participate in cell cycle checkpoints and DNA repair.

How does a cell stop its progress from one stage of the cell cycle to the next? We will briefly examine two well-studied pathways available to mammalian cells to arrest their cell cycle in response to DNA damage.

1. If a cell preparing to enter mitosis is subjected to UV irradiation, ATR kinase is activated and the cell arrests in G2. ATR kinase molecules are thought to be recruited to sites of protein-coated, single-stranded DNA (step 1, Figure 14.9), such as those present as UV-damaged DNA is repaired (Figure 13.25). ATR phosphorylates and activates a checkpoint kinase, called Chk1 (step 2), which in turn phosphorylates Cdc25 on a particular serine residue (step 3), making the Cdc25 molecule a target for a special adaptor protein (labeled 14–3–3σ in Figure 14.9) that binds to Cdc25 in the cytoplasm (steps 4, 5). This interaction inhibits Cdc25’s phosphatase activity and prevents it from being reimported into the nucleus. As discussed on page 577, Cdc25 normally plays a key role in the G2/M transition by removing inhibitory phosphates from Cdk1. Thus, the absence of Cdc25 from the nucleus leaves the Cdk in an inactive state (step 6) and the cell arrested in G2.

2. Damage to DNA also leads to the synthesis of proteins that directly inhibit the cyclin–Cdk complex that drives the cell cycle. For example, cells exposed to ionizing radiation in G1 synthesize a protein called p21 (molecular mass of 21 kDa) that inhibits the kinase activity of the G1 Cdk. This prevents the cells from phosphorylating key substrates and from entering S phase. ATM is involved in this checkpoint mechanism. In this particular DNA-damage response, the breaks in DNA that are caused by ionizing radiation serve as sites for the recruitment of a protein complex termed MRN (step a, Figure 14.9). MRN can be considered as a sensor of DNA breaks. MRN recruits and activates ATM, which phosphorylates and activates another checkpoint kinase called Chk2 (step b). Chk2 in turn phosphorylates a transcription factor (p53) (step c),

Figure 14.9 Models for the mechanism of action of two DNA-damage checkpoints. ATM and ATR are protein kinases that become activated following specific types of DNA damage. Each of these proteins acts through checkpoint signaling pathways that lead to cell cycle arrest. ATM becomes activated in response to double-strand breaks, which are detected by the MRN protein complex (step a). ATR, on the other hand, becomes activated by protein-coated ssDNA (step 1) that forms when replication forks become stalled or the DNA is being repaired after various types of damage. In the G2 pathway shown here, ATR phosphorylates and activates the checkpoint kinase Chk1 (step 2), which phosphorylates and inactivates the phosphatase Cdc25 (step 3), which normally shuttles between the nucleus and cytoplasm (step 4). Once phosphorylated, Cdc25 is bound by an adaptor protein in the cytoplasm (step 5) and cannot be reimported into the nucleus, which leaves the Cdk in its inactivated, phosphorylated state (step 6). In the G1 pathway shown here, ATM phosphorylates and activates the checkpoint kinase Chk2 (step b), which phosphorylates p53 (step c). p53 is normally very short-lived, but phosphorylation by Chk2 stabilizes the protein, enhancing its ability to activate p21 transcription (step d). Once transcribed and translated (step e), p21 directly inhibits the Cdk (step f). Many other proteins, including histone-modifying enzymes, chromatin remodeling complexes, and histone variants, are involved in mediating the response to DNA damage but are not discussed (see Curr. Opin. Cell Biol. 21:245, 2009; Nature Revs. Mol. Cell Biol. 10:243, 2009; Nature Cell Biol. 13:1161, 2011; and Genes Develop. 25:409, 2011).
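The logic of the two pathways in Figure 14.9 can be traced step by step in a toy Boolean model. This is a deliberately crude sketch: real signaling is graded and time-dependent, and the function names and the `p53_functional` parameter are hypothetical, introduced here only to make the chain of activation and inhibition explicit.

```python
def g2_checkpoint(uv_damage):
    """G2 pathway (steps 1-6 of Figure 14.9). Returns True if the cell arrests in G2."""
    atr_active = uv_damage                         # step 1: ATR recruited to protein-coated ssDNA
    chk1_active = atr_active                       # step 2: ATR phosphorylates and activates Chk1
    cdc25_phosphorylated = chk1_active             # step 3: Chk1 phosphorylates Cdc25
    cdc25_in_nucleus = not cdc25_phosphorylated    # steps 4-5: adaptor protein holds Cdc25 in the cytoplasm
    cdk_active = cdc25_in_nucleus                  # step 6: without Cdc25, inhibitory phosphates stay on the Cdk
    return not cdk_active


def g1_checkpoint(double_strand_break, p53_functional=True):
    """G1 pathway (steps a-f of Figure 14.9). Returns True if the cell arrests in G1.

    p53_functional is a hypothetical switch standing in for a cell whose p53 is inactive.
    """
    atm_active = double_strand_break               # step a: MRN detects the break and recruits ATM
    chk2_active = atm_active                       # step b: ATM phosphorylates and activates Chk2
    p53_stable = chk2_active and p53_functional    # step c: Chk2 phosphorylates and stabilizes p53
    p21_present = p53_stable                       # steps d-e: p21 gene transcribed and translated
    cdk_active = not p21_present                   # step f: p21 inhibits the Cdk
    return not cdk_active


print(g2_checkpoint(uv_damage=True))                                  # True
print(g1_checkpoint(double_strand_break=True))                        # True
print(g1_checkpoint(double_strand_break=True, p53_functional=False))  # False
```

In this sketch, removing any single link in the chain (here, p53) is enough to abolish the arrest.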


Figure 14.10 p27: a Cdk inhibitor that arrests cell cycle progression. (a) Three-dimensional structure of a complex between p27 and cyclin A–Cdk2. Interaction with p27 alters the conformation of the Cdk catalytic subunit, inhibiting its protein kinase activity. (b) A pair of littermates at 12 weeks of age. In addition to possessing different genes for coat color, the mouse with dark fur has been genetically engineered to lack both copies of the p27 gene (denoted as p27-/-), which accounts for its larger size. (c) Comparison of the thymus glands from a normal (left) and a p27-/- mouse (right). The gland from the p27 knockout mouse is much larger owing to an increased number of cells. (A: FROM ALICIA A. RUSSO ET AL., NATURE 382:327, 1996, FIG. 2A. COURTESY OF NIKOLA PAVLETICH, HOWARD HUGHES MEDICAL INSTITUTE; REPRINTED BY PERMISSION OF MACMILLAN PUBLISHERS LIMITED; B,C: FROM KEIKO NAKAYAMA ET AL., COURTESY OF KEI-ICHI NAKAYAMA, CELL 85:710, 711, 1996; WITH PERMISSION FROM ELSEVIER.)

which leads to the transcription and translation of the p21 gene (steps d and e) and subsequent inhibition of Cdk (step f). Approximately 50 percent of all human tumors show evidence of mutations in the gene that encodes p53, which reflects its importance in the control of cell growth. The role of p53 is discussed at length in Chapter 16.

p21 is only one of at least seven known Cdk inhibitors. The interaction between a related Cdk inhibitor (p27) and one of the cyclin–Cdk complexes is shown in Figure 14.10a. In this structural model, the p27 molecule drapes itself across both subunits of the cyclin A–Cdk2 complex, changing the conformation of the catalytic subunit and inhibiting its kinase activity. In many cells, p27 must be phosphorylated and then degraded before progression into S phase can occur.

Cdk inhibitors, such as p21 and p27, are also active in cell differentiation. Just before cells begin to differentiate (whether into muscle cells, liver cells, blood cells, or some other type), they typically withdraw from the cell cycle and stop dividing. Cdk inhibitors are thought to either allow or directly induce cell cycle withdrawal. Just as the functions of specific Cdks and cyclins have been studied in knockout mice, so too have their inhibitors. Knockout mice that lack the p27 gene show a distinctive phenotype: they are larger than normal (Figure 14.10b), and certain organs, such as the thymus gland and spleen, contain a significantly greater number of cells than those of a normal animal (Figure 14.10c). In normal mice, the cells of these particular organs synthesize relatively high levels of p27, and it is presumed that the absence of this protein in the p27-deficient animals allows the cells to divide several more times before they differentiate.

REVIEW

1. What is the cell cycle? What are the stages of the cell cycle? How does the cell cycle vary among different types of cells?
2. Describe how [3H]thymidine and autoradiography can be used to determine the length of the various periods of the cell cycle.
3. What is the effect of fusing a G1-phase cell with one in M; of fusing a G2- or S-phase cell with one in M?
4. How does the activity of MPF vary throughout the cell cycle? How is this correlated with the concentration of cyclins? How does the cyclin concentration affect MPF activity?
5. What are the respective roles of CAK, Wee1, and Cdc25 in controlling Cdk activity in fission yeast cells? What is the effect of mutations in the wee1 or cdc25 genes in these cells?
6. What is meant by a cell cycle checkpoint? What is its importance? How does a cell stop its progress at one of these checkpoints?

14.2 | M Phase: Mitosis and Cytokinesis

Whereas our understanding of cell cycle regulation rests largely on genetic studies in yeast, our knowledge of M phase is based on more than a century of microscopic and biochemical research on animals and plants. The name “mitosis” comes from the Greek word mitos, meaning “thread.” The name was coined in 1882 by the German biologist Walther Flemming to describe the threadlike chromosomes that mysteriously appeared in animal cells just before they divided in two. The beauty and precision of cell division is best appreciated by watching a time-lapse video of the process (e.g., www.bio.unc.edu/faculty/salmon/lab/mitosis/mitosismovies.html) rather than reading about it in a textbook.

Prophase
1. Chromosomal material condenses to form compact mitotic chromosomes. Chromosomes are seen to be composed of two chromatids attached together at the centromere.
2. Cytoskeleton is disassembled, and mitotic spindle is assembled.
3. Golgi complex and ER fragment. Nuclear envelope disperses.

Prometaphase
1. Chromosomal microtubules attach to kinetochores of chromosomes.
2. Chromosomes are moved to spindle equator.

Metaphase
1. Chromosomes are aligned along metaphase plate, attached by chromosomal microtubules to both poles.

Anaphase
1. Centromeres split, and chromatids separate.
2. Chromosomes move to opposite spindle poles.
3. Spindle poles move farther apart.

Telophase
1. Chromosomes cluster at opposite spindle poles.
2. Chromosomes become dispersed.
3. Nuclear envelope assembles around chromosome clusters.
4. Golgi complex and ER re-form.
5. Daughter cells formed by cytokinesis.

Figure 14.11 The stages of mitosis in an animal cell (left drawings) and a plant cell (right photos). (MICROGRAPHS COURTESY OF ANDREW BAJER.)

Mitosis is a process of nuclear division in which the replicated DNA molecules of each chromosome are faithfully segregated into two nuclei. Mitosis is usually accompanied by cytokinesis, a process by which a dividing cell splits in two, partitioning the cytoplasm into two cellular packages. The two daughter cells resulting from mitosis and cytokinesis possess a genetic content identical to each other and to the mother cell from which they arose. Mitosis, therefore, maintains the chromosome number and generates new cells for the growth and maintenance of an organism. Mitosis can take place in either haploid or diploid cells. Haploid mitotic cells are found in fungi, plant gametophytes, and a few animals (including male bees known as drones). Mitosis is a stage of the cell cycle when the cell devotes virtually all of its energy to a single activity—chromosome segregation. As a result, most metabolic activities of the cell, including transcription and translation, are curtailed during mitosis, and the cell becomes relatively unresponsive to external stimuli. We have seen in previous chapters how much can be learned about the factors responsible for a particular process by studying that process outside of a living cell (page 276). Our understanding of the biochemistry of mitosis has been greatly aided by the use of extracts prepared from frog eggs. These extracts contain stockpiles of all the materials (histones, tubulin, etc.) necessary to support mitosis. When chromatin or whole nuclei are added to the egg extract, the chromatin is compacted into mitotic chromosomes, which are segregated by a mitotic spindle that assembles spontaneously within the cell-free mixture. In many experiments, the role of a particular protein in mitosis can be studied by removing that protein from the egg extract by addition of an antibody (immunodepletion) and determining whether the process can continue in the absence of that substance (see Figure 14.21 for an example). 
Mitosis is generally divided into five stages (Figure 14.11), prophase, prometaphase, metaphase, anaphase, and telophase, each characterized by a particular series of events. Keep in mind that each of these stages represents a segment of a continuous process; the division of mitosis into arbitrary phases is done only for the sake of discussion and experimentation.
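Because the five stages always occur in the same order, they can be represented as an ordered enumeration. The sketch below is illustrative only: the class and function names are assumptions, and the one-line summaries are condensed from Figure 14.11. The discrete labels are a convenience, since each stage is a segment of a continuous process.

```python
from enum import IntEnum


class MitoticStage(IntEnum):
    """The five stages of mitosis, numbered in the order in which they occur."""
    PROPHASE = 1
    PROMETAPHASE = 2
    METAPHASE = 3
    ANAPHASE = 4
    TELOPHASE = 5


# One-line summaries condensed from Figure 14.11 (wording is illustrative).
HALLMARK = {
    MitoticStage.PROPHASE: "chromosomes condense; mitotic spindle is assembled",
    MitoticStage.PROMETAPHASE: "microtubules attach to kinetochores; chromosomes move to the spindle equator",
    MitoticStage.METAPHASE: "chromosomes are aligned along the metaphase plate",
    MitoticStage.ANAPHASE: "chromatids separate and move to opposite spindle poles",
    MitoticStage.TELOPHASE: "chromosomes cluster at the poles; nuclear envelope reassembles",
}


def next_stage(stage):
    """Return the stage that follows `stage`, or None after telophase."""
    if stage < MitoticStage.TELOPHASE:
        return MitoticStage(stage + 1)
    return None


for stage in MitoticStage:
    print(f"{stage.name.capitalize()}: {HALLMARK[stage]}")
```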

Prophase

During the first stage of mitosis, that of prophase, the duplicated chromosomes are prepared for segregation and the mitotic machinery is assembled.

Formation of the Mitotic Chromosome

The nucleus of an interphase cell contains tremendous lengths of chromatin fibers. The extended state of interphase chromatin is ideally suited for the processes of transcription and replication but not for segregation into two daughter cells. Before segregating its chromosomes, a cell converts them into much shorter, thicker structures by a remarkable process of chromosome compaction (or chromosome condensation), which occurs during early prophase (Figures 14.11 and 14.12). As described on page 496, the chromatin of an interphase cell is organized into fibers approximately 30 nm in diameter. Although there is debate on this issue, mitotic chromosomes are thought to be composed of similar types of fibers as seen by electron microscopic examination of whole chromosomes isolated from mitotic cells (Figure 14.13a). According to this viewpoint, chromosome compaction does not alter the nature of the chromatin fiber, but rather the way that the chromatin fiber is packaged. Treatment of mitotic chromosomes with solutions that solubilize the histones and the majority of the nonhistone proteins reveals a structural framework or scaffold that retains the basic shape of the intact chromosome (Figure 14.13b). Loops of DNA are attached at their base to the nonhistone proteins that make up this chromosome scaffold (shown at higher magnification in Figure 12.15).

Figure 14.12 Prophase nuclear morphology. Light-optical section through two mouse cell nuclei in prophase, recorded with superresolution 3D-structured illumination microscopy (3D-SIM). Condensed chromosomes are shown in red, the nuclear envelope in blue, and microtubules in green. Scale bar is 5 µm. (COURTESY OF LOTHAR SCHERMELLEH.)

Research on chromosome compaction has focused on an abundant multiprotein complex called condensin. The proteins of condensin were discovered by incubating nuclei in frog egg extracts and identifying those proteins that associated with the chromosomes as they underwent compaction. Removal of condensin from the extracts prevented normal chromosome compaction.

How is condensin involved in such dramatic changes in chromatin architecture? There is very little data available from in vivo studies to answer this question, but there is considerable speculation. Supercoiled DNA occupies a much smaller volume than relaxed DNA (see Figure 10.12), and studies suggest that DNA supercoiling plays a key role in compacting a chromatin fiber into the tiny volume occupied by a mitotic chromosome. In the presence of a topoisomerase and ATP, condensin is able to bind to DNA in vitro and curl the DNA into positively supercoiled loops. This finding fits nicely with the observation that chromosome compaction at prophase requires topoisomerase II, which along with condensin is present as part of the mitotic chromosome scaffold (Figure 14.13b). A speculative model for condensin action is shown in Figure 14.14. Condensin is activated at the onset of mitosis by phosphorylation of several of its subunits by the cyclin–Cdk responsible for driving cells from G2 into mitosis. Thus condensin is one of the targets through which Cdks are able to trigger cell cycle activities. The subunit structure of a V-shaped condensin molecule is shown in the right inset of Figure 14.14.

Figure 14.13 The mitotic chromosome. (a) Electron micrograph of a whole-mount preparation of a human mitotic chromosome. The structure is seen to be composed of a knobby fiber 30 nm in diameter, which is similar to that found in interphase chromosomes. (b) Appearance of a mitotic chromosome after the histones and most of the nonhistone proteins have been removed. The residual proteins form a scaffold from which loops of DNA are seen to emerge (the DNA loops are shown more clearly in Figure 12.15). (A: COURTESY OF GUNTHER F. BAHR, ARMED FORCES INSTITUTE OF PATHOLOGY, WASHINGTON, D.C.; B: FROM JAMES R. PAULSON AND ULRICH K. LAEMMLI, CELL 12:820, 1977, WITH PERMISSION FROM ELSEVIER.)

As the result of compaction, the chromosomes of a mitotic cell appear as distinct, rod-like structures. Close examination of mitotic chromosomes reveals each of them to be composed of two mirror-image, “sister” chromatids (Figure 14.13a). Sister chromatids are a result of replication in the previous interphase. Prior to replication, the DNA of each interphase chromosome becomes associated at sites along its length with a multiprotein complex called cohesin (Figure 14.14). Following replication, cohesin holds the two sister chromatids together continuously through G2 and into mitosis when they are ultimately separated. As indicated in the insets of Figure 14.14, condensin and cohesin have a similar structural organization. A number of experiments support the hypothesis that the cohesin ring encircles two sister DNA molecules as shown in both the left and right portions of Figure 14.14.

[Figure 14.14 labels: Interphase (left), Prophase (right); DNA, Cohesin, Condensin; inset subunit labels Scc1, Scc3, Smc1, Smc2, Smc3, Smc4]

Figure 14.14 Model for the roles of condensin and cohesin in the formation of mitotic chromosomes. Just after replication, the DNA helices of a pair of sister chromatids would be held in association by cohesin molecules that encircled the sister DNA helices, as shown at the left of the drawing. As the cell entered mitosis, the compaction process would begin, aided by condensin molecules, as shown in the right part of the drawing. In this model, condensin brings about chromosome compaction by forming a ring around supercoiled loops of DNA within chromatin. Cohesin molecules would continue to hold the DNA of sister chromatids together. It is proposed (but not shown in this drawing) that cooperative interactions between condensin molecules would then organize the supercoiled loops into larger coils, which are then folded into a mitotic chromosome fiber. The top left and right insets show the subunit structure of an individual cohesin and condensin complex, respectively. Both complexes are built around a pair of SMC subunits. Each of the SMC polypeptides folds back on itself to form a highly elongated antiparallel, coiled coil with an ATP-binding globular domain where the N- and C-termini come together. Cohesin and condensin also have two or three non-SMC subunits that complete the ring-like structure of these proteins.

In vertebrates, cohesin is released from the chromosomes in two distinct stages. Most of the cohesin dissociates from the arms of the chromosomes as they become compacted during prophase. Dissociation is induced by phosphorylation of cohesin subunits by two important mitotic enzymes called Polo-like kinase and Aurora B kinase. In the wake of this event, the chromatids of each mitotic chromosome are held relatively loosely along their extended arms, but much more tightly at their centromeres (Figure 14.13a and Figure 14.15). Cohesin remains at the centromeres because of the presence there of a phosphatase that removes any phosphate groups added to the protein by the kinases. Release of cohesin from the centromeres is normally delayed until anaphase as described on page 592. If the phosphatase is experimentally inactivated, sister chromatids separate from one another prematurely prior to anaphase.

Centromeres and Kinetochores

The most notable landmark on a mitotic chromosome is an indentation or primary constriction, which marks the position of the centromere (Figure 14.13a). The centromere is the residence of highly repeated DNA sequences (see Figure 10.19) that serve as the binding sites for specific proteins. Examination of sections through a mitotic chromosome reveals the presence of a proteinaceous, button-like structure, called the kinetochore, at the outer surface of the centromere of each chromatid (Figure 14.16a,b). Most of the proteins that make up the kinetochore assemble at the centromere at early prophase. Kinetochore proteins are thought to be recruited to the centromere because of the presence there of the novel nucleosomes containing the histone variant CENP-A (page 509). As will be apparent shortly, the kinetochore functions as (1) the site of attachment of the chromosome to the dynamic microtubules of the mitotic spindle (as in Figure 14.30), (2) the residence of several motor proteins involved in chromosome motility (Figure 14.16c), and (3) a key component in the signaling pathway of an important mitotic checkpoint (see Figure 14.31).

A question of great interest to scientists studying kinetochores is how these structures are able to maintain their attachment to microtubules that are continually growing and shrinking at their plus end. To maintain this type of “floating grip,” the coupler would have to move with the end of the microtubule as subunits were added or removed. Figure 14.16c depicts two types of proteins that have been implicated as possible linkers of a kinetochore to dynamic microtubules, namely, motor proteins and a rod-shaped protein complex called Ndc80. Ndc80 is an essential kinetochore component that forms fibrils that appear to reach out and bind the surface of the adjacent microtubule. Cells lacking any of the four proteins that make up the Ndc80 complex exhibit severe spindle attachment defects.

Figure 14.15 Each mitotic chromosome is comprised of a pair of sister chromatids connected to one another by the protein complex cohesin. (a) Colorized scanning electron micrograph of several metaphase chromosomes showing the paired identical chromatids associated loosely along their length and joined tightly at the centromere. The chromatids are not split apart from one another until anaphase. (b) Fluorescence micrograph of a metaphase chromosome in a cultured human cell. The DNA is stained blue, the kinetochores are green, and cohesin is red. At this stage of mitosis, cohesin has been lost from the arms of the sister chromatids but remains concentrated at the centromeres where the two sisters are tightly joined. (A: ANDREW SYRED/PHOTO RESEARCHERS, INC.; B: BY S. HAUF AND JAN-MICHAEL PETERS, NATURE CELL BIOL. 3:E17, 2001 FIG. 1C. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)
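The two-stage release of cohesin in vertebrate cells, described in the text and illustrated in Figure 14.15, can be expressed as a small decision rule. This is a simplified sketch, not a quantitative model: the function name, the stage labels, and the `centromere_phosphatase_active` parameter are assumptions made for illustration, and "most cohesin lost from the arms" is reduced to a plain True/False.

```python
STAGES = ("interphase", "prophase", "metaphase", "anaphase")
REGIONS = ("arm", "centromere")


def cohesin_bound(region, stage, centromere_phosphatase_active=True):
    """Return True if cohesin is still holding sister chromatids together at `region`."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")

    if stage == "interphase":
        # After replication, cohesin holds sisters together along their whole length.
        return True
    if stage == "anaphase":
        # Centromeric cohesin is released at anaphase; arm cohesin is already gone.
        return False

    # Prophase through metaphase: Polo-like kinase and Aurora B kinase
    # phosphorylate cohesin, which induces its dissociation.
    if region == "arm":
        return False
    # At the centromere, a phosphatase removes the added phosphate groups,
    # so cohesin stays bound only while that phosphatase is active.
    return centromere_phosphatase_active


print(cohesin_bound("arm", "metaphase"))         # False
print(cohesin_bound("centromere", "metaphase"))  # True
print(cohesin_bound("centromere", "metaphase",
                    centromere_phosphatase_active=False))  # False
```

The last call corresponds to the experiment in which the phosphatase is inactivated and sister chromatids separate prematurely.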

[Figure 14.16 labels. Part a (scale bar, 0.2 µm): Outer kinetochore (microtubule binding, microtubule motor activity, signal transduction); Inner kinetochore (centromere replication, chromatin interface, kinetochore formation); Microtubules; Centromeric heterochromatin; Pericentromeric heterochromatin. Part b: Inner plate, Interzone, Outer plate, Fibrous corona, Microtubule. Part c: Outer kinetochore plate, Ndc80, CENP-E, Dynein, Corona fiber, Depolymerase, Kinetochore microtubule, plus end of microtubules where subunits are added or lost.]

Figure 14.16 The kinetochore. (a) Electron micrograph of a section through one of the kinetochores of a mammalian metaphase chromosome, showing its three-layered (trilaminar) structure. Microtubules of the mitotic spindle can be seen to terminate at the kinetochore. (b) Schematic representation of the kinetochore, which contains an electron-dense inner and outer plate separated by a lightly staining interzone. Proposed functions of the inner and outer plates are indicated in part a. The inner plate contains a variety of proteins attached to the centromeric heterochromatin of the chromosome. Associated with the outer plate is the fibrous corona, which binds motor proteins involved in chromosome movement. (c) A schematic model showing a proposed disposition of several of the proteins found at the outer surface of the kinetochore. Among the motor proteins associated with the kinetochore, cytoplasmic dynein moves toward the minus end of a microtubule, whereas CENP-E moves toward the plus end. These motors may also play a role in tethering the microtubule to the kinetochore. The protein labeled “depolymerase” is a member of the kinesin superfamily that functions in depolymerization of microtubules rather than motility. In this drawing, the depolymerases are in an inactive state (the microtubule is not depolymerizing). Ndc80 is a protein complex consisting of four different protein subunits that form a 57 nm-long, rod-shaped molecule extending outward from the body of the kinetochore. Globular domains at either end of the complex mediate attachment to the microtubule and kinetochore. These Ndc80 fibrils have been implicated as couplers of the kinetochore to the plus end of a dynamic microtubule. (A: FROM DON W. CLEVELAND, UCSD, ET AL., CELL 112:408, 2003 FIG. 1C, WITH PERMISSION FROM ELSEVIER. IMAGE COURTESY OF KEVIN SULLIVAN.)

Formation of the Mitotic Spindle We discussed in Chapter 9 how microtubule assembly in animal cells is initiated by a special microtubule-organizing structure, the centrosome (page 339). As a cell progresses past G2 and into mitosis, the microtubules of the cytoskeleton undergo sweeping disassembly in preparation for their reassembly as components of a complex, micro-sized machine called the mitotic spindle. The rapid disassembly of the interphase cytoskeleton is thought to be accomplished by the inactivation of proteins that stabilize microtubules (e.g., microtubule-associated proteins, or MAPs) and the activation of proteins that destabilize these polymers. To understand the formation of the mitotic spindle, we need to first examine the centrosome cycle as it progresses in concert with the cell cycle (Figure 14.17a). When an animal cell exits mitosis, the cytoplasm contains a single centrosome containing two centrioles situated at right angles to one another. Even before cytokinesis has been completed, the two centrioles of each daughter cell lose their close association to one another (they are said to “disengage”). This event is triggered by the enzyme separase, which becomes activated late in mitosis (page 592) and cleaves a proteinaceous link holding the centrioles together. Later, as DNA replication begins in the nucleus at the onset of S phase, each centriole of the centrosome initiates its “replication” in the cytoplasm. The process begins with the

appearance of a small procentriole next to each preexisting (maternal) centriole and oriented at right angles to it (Figure 14.17b). Subsequent microtubule elongation converts each procentriole into a full-length daughter centriole. At the beginning of mitosis, the centrosome splits into two adjacent centrosomes, each containing a pair of mother–daughter centrioles. The initiation of centrosome duplication at the G1–S transition is normally triggered by phosphorylation of a centrosomal protein by Cdk2, the same agent responsible for the onset of DNA replication (Figure 14.8). Centrosome duplication is a tightly controlled process so that each mother centriole produces only one daughter centriole during each cell cycle. The formation of additional centrioles can lead to abnormal cell division and may contribute to the development of cancer (Figure 14.17c). The first stage in the formation of the mitotic spindle in a typical animal cell is the appearance of microtubules in a “sunburst” arrangement, or aster (Figure 14.18), around each centrosome during early prophase. As discussed in Chapter 9, microtubules grow by addition of subunits to their plus ends, while their minus ends remain associated with the pericentriolar material (PCM) of the centrosome (page 339). Phosphorylation of proteins of the PCM by Polo-like kinase is thought to play a key role in stimulating nucleation of spindle microtubules during prophase. The process of aster formation is

587

Early G1 phase

S phase

(b)

Mitosis (a)

Figure 14.17 The centrosome cycle of an animal cell. (a) During G1, the centrosome contains a single pair of centrioles that are no longer as tightly associated as they were during mitosis. During S phase, daughter procentrioles form adjacent to maternal centrioles so that two pairs of centrioles become visible within the centrosome (see b). The daughter procentrioles continue to elongate during G2 phase, and at the beginning of mitosis, the centrosome splits, with each pair of centrioles becoming part of its own centrosome. As they separate, the centrosomes organize the microtubule fibers that make up the mitotic spindle. (b) The centrosome of this cell is seen to contain two mother centrioles, each with an associated daughter procentriole (arrows). (c) This mouse mammary cancer cell contains more than the normal complement of two centrosomes (red) and has assembled a multipolar spindle apparatus (green). Additional centrosomes lead to chromosome missegregation and abnormal numbers of chromosomes (blue), which are characteristic of malignant cells. (A: AFTER D. R. KELLOGG ET AL., REPRODUCED WITH PERMISSION FROM THE ANNUAL REVIEW OF BIOCHEMISTRY, VOL. 63; COPYRIGHT 1994, ANNUAL REVIEW OF BIOCHEMISTRY BY ANNUAL REVIEWS. REPRODUCED WITH PERMISSION OF ANNUAL REVIEWS IN THE FORMAT REPUBLISH IN A BOOK VIA COPYRIGHT; B: FROM JEROME B. RATTNER AND STEPHANIE G. PHILLIPS, J. CELL BIOL. 57:363, 1973, FIG. 4. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS; C: COURTESY OF THEA GOEPFERT AND W. R. BRINKLEY, BAYLOR COLLEGE OF MEDICINE, HOUSTON, TX.)

[Figure 14.18 labels: Astral microtubules; Centrosome]

Figure 14.18 Formation of the mitotic spindle. During prophase, as the chromosomes are beginning to condense, the centrosomes move apart from one another as they organize the bundles of microtubules that form the mitotic spindle. This micrograph shows a cultured newt lung cell in early prophase that has been stained with fluorescent antibodies against tubulin, which reveals the distribution of the cell’s microtubules (green). The microtubules of the developing mitotic spindle are seen to emanate as asters from two sites within the cell. These sites correspond to the locations of the two centrosomes that are moving toward opposite poles at this stage of prophase. The centrosomes are situated above the cell nucleus, which appears as an unstained dark region. (FROM JENNIFER WATERS, RICHARD W. COLE, AND CONLY L. RIEDER, J. CELL BIOL. 122:364, 1993, FIG. 2. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS. COURTESY OF CONLY L. RIEDER.)

14.2 M Phase: Mitosis and Cytokinesis

followed by separation of the centrosomes from one another and their subsequent movement around the nucleus toward opposite ends of the cell. Centrosome separation is driven by motor proteins associated with the adjacent microtubules. As the centrosomes separate, the microtubules stretching between them increase in number and elongate (Figure 14.18). Eventually, the two centrosomes reach points opposite one another, thus establishing the two poles of a bipolar mitotic spindle (as in Figure 14.17a). Following mitosis, one centrosome will be distributed to each daughter cell. A number of different types of animal cells (including those of the early mouse embryo) lack centrosomes, as do the cells of higher plants, yet all of these cells construct a bipolar mitotic spindle and undergo a relatively typical mitosis. Functional mitotic spindles can even form in mutant Drosophila cells that lack centrosomes or in mammalian cells in which the centrosome has been experimentally removed. In all of these cases,


[Figure 14.19 labels: Motor protein; microtubule plus (+) and minus (−) ends]

Figure 14.19 Formation of a spindle pole in the absence of centrosomes. In this model, each motor protein has multiple heads, which are bound to different microtubules. The movement of these motor proteins causes the minus ends of the microtubules to converge to form a distinct spindle pole. This type of mechanism is thought to facilitate the formation of spindle poles in the absence of centrosomes but may also play a role when centrosomes are present. (FROM A. A. HYMAN AND E. KARSENTI, CELL 84:406, 1996; BY PERMISSION OF CELL PRESS. CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Chapter 14 Cellular Reproduction

the microtubules of the mitotic spindle are nucleated near the chromosomes rather than at the poles where centrosomes would normally reside. Then, once they have polymerized, the minus ends of the microtubules are brought together (i.e., focused) at each spindle pole through the activity of motor proteins (Figure 14.19). The chapter-opening photograph on page 572 shows a bipolar spindle that has formed in a frog egg extract through the activity of microtubule motors. These types of experiments suggested that cells possess two fundamentally different mechanisms, one centrosome-dependent and the other centrosome-independent, to achieve the same end result. Recent studies have indicated that both pathways to spindle formation operate simultaneously in the same cell and that even cells with functional centrosomes nucleate a significant fraction of their spindle microtubules at the chromosomes.

The Dissolution of the Nuclear Envelope and Partitioning of Cytoplasmic Organelles

In most eukaryotic cells, the mitotic spindle is assembled in the cytoplasm and the chromosomes are compacted in the nucleoplasm. Interaction between the spindle and chromosomes is made possible by the breakdown of the nuclear envelope at the end of prophase. The three major components of the nuclear envelope (the nuclear pore complexes, nuclear lamina, and nuclear membranes) are disassembled in separate processes. All of these processes are thought to be initiated by phosphorylation of key substrates by mitotic kinases, particularly cyclin B–Cdk1. The nuclear pore complexes are disassembled as the interactions between nucleoporin subcomplexes are disrupted and the subcomplexes dissociate into the surrounding medium. The nuclear lamina is disassembled by depolymerization of the lamin filaments. The integrity of the nuclear membranes is first disrupted mechanically as holes are torn into the nuclear envelope by cytoplasmic dynein molecules associated with the outer nuclear membrane.
The subsequent fate of the membranous portion of the nuclear envelope has been the subject of controversy. According to the classical view, the nuclear membranes are fragmented into a population of small vesicles that disperse throughout the mitotic cell. Alternatively, the membranes of the nuclear envelope may be absorbed into the membranes of the ER.

Some of the membranous organelles of the cytoplasm remain relatively intact through mitosis; these include mitochondria, lysosomes, and peroxisomes, as well as the chloroplasts of a plant cell. Considerable debate has been generated in recent years over the mechanism by which the Golgi complex and endoplasmic reticulum are partitioned during mitosis. According to one view, the contents of the Golgi complex become incorporated into the ER during prophase, and the Golgi complex ceases to exist briefly as a distinct organelle. According to an alternate view, the Golgi membranes become fragmented to form a distinct population of small vesicles that are partitioned between daughter cells. A third view based primarily on studies in algae and protists has the entire Golgi complex splitting in two, with each daughter cell receiving half of the original structure. Ultimately, we may learn that different types of cells or organisms utilize different mechanisms of Golgi inheritance. Our ideas about the fate of the ER have also changed. Recent studies on living, cultured mammalian cells suggest that the ER network remains relatively intact during mitosis. This view challenges earlier studies performed largely on eggs and embryos that suggested the ER undergoes extensive fragmentation during prophase.

Prometaphase

The dissolution of the nuclear envelope marks the start of the second phase of mitosis, prometaphase, during which mitotic spindle assembly is completed and the chromosomes are moved into position at the center of the cell. The following discussion provides a generalized picture of the steps of prometaphase; many variations on these events have been reported. At the beginning of prometaphase, compacted chromosomes are scattered throughout the space that was the nuclear region (Figure 14.20a). As the microtubules of the spindle penetrate into the central region of the cell, the free (plus) ends of the microtubules are seen to grow and shrink in a dynamic fashion, as if they were “searching” for a chromosome. It is not certain whether searching is entirely random, as evidence suggests that microtubules may grow preferentially toward a site containing chromatin. Those microtubules that contact a kinetochore are “captured” and stabilized. A kinetochore typically makes initial contact with the sidewall of a microtubule rather than its end (step 1, Figure 14.20b). Once initial contact is made, some chromosomes move actively along the wall of the microtubule, powered by motor proteins located in the kinetochore. Soon, however, the kinetochore tends to become stably associated with the plus end of one or more spindle microtubules from one of the spindle poles (step 2). A chromosome that is attached to microtubules from only one spindle pole represents an unstable intermediate stage in the course of prometaphase. Eventually, the unattached kinetochore on the sister chromatid captures its own microtubules from the opposite spindle pole (step 3). It has also been reported that unattached kinetochores serve as nucleating sites for the assembly of microtubules. These microtubules grow out from the chromosome by incorporation of tubulin subunits at the kinetochore and, once assembled, they become incorporated into the mitotic spindle.
Regardless of how it occurs, the two sister chromatids of each mitotic chromosome ultimately become connected by their kinetochores to microtubules that extend from opposite poles. Observations in living cells indicate that prometaphase chromosomes associated with spindle microtubules are not moved directly to the center of the spindle but rather oscillate back and forth in both a poleward and antipoleward direction. Ultimately, the chromosomes of a prometaphase cell are moved by a process called congression toward the center of the mitotic spindle, midway between the poles (step 4, Figure 14.20b). The forces required for chromosome movements during prometaphase are generated by motor proteins associated with both the kinetochores and arms of the chromosomes (depicted in Figure 14.33a and discussed in the legend). Figure 14.21 shows the consequences of a deficiency of a chromosomal motor protein whose activity pushes chromosomes away from the poles.
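The attachment states named in this discussion (mono-oriented, bi-oriented, and the abnormal syntelic attachment of Figure 14.20b) can be summarized as a small decision rule. The following is only an illustrative sketch: the function names and the pole labels "left" and "right" are hypothetical and do not come from the text, and real kinetochore attachments are far more varied than these four categories.

```python
from typing import Optional


def classify_attachment(kinetochore_a: Optional[str],
                        kinetochore_b: Optional[str]) -> str:
    """Classify a chromosome by the pole each sister kinetochore is attached to.

    Each argument is a pole label (hypothetical: "left" or "right"), or None
    if that kinetochore has no end-on microtubule attachment.
    """
    attached = [pole for pole in (kinetochore_a, kinetochore_b) if pole is not None]
    if not attached:
        return "unattached"
    if len(attached) == 1:
        # Microtubules from only one pole: an unstable intermediate.
        return "mono-oriented"
    if attached[0] == attached[1]:
        # Both sister kinetochores attached to the same pole.
        return "syntelic"
    # Sister kinetochores attached to opposite poles.
    return "bi-oriented"


def under_tension(state: str) -> bool:
    """Only a bi-oriented chromosome is pulled from both sides."""
    return state == "bi-oriented"


if __name__ == "__main__":
    for pair in [("left", None), ("left", "right"), ("left", "left")]:
        state = classify_attachment(*pair)
        print(pair, state, "tension" if under_tension(state) else "no tension")
```

The rule mirrors the figure labels: only the bi-oriented state is drawn under tension, while the mono-oriented and syntelic states are not.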

[Figure 14.20b labels: Mono-oriented chromosome (no tension); Bi-oriented chromosome (under tension); Syntelic attachment (no tension); Tension applied by microtubules; steps 1, 2, 3, 3a, 4]

Figure 14.21 The consequence of a missing motor protein on chromosome alignment during prometaphase. The top micrograph shows a mitotic spindle that has assembled in a complete frog egg extract. The lower micrograph shows a mitotic spindle that has assembled in a frog egg extract that has been depleted of a particular kinesin-related protein called Kid that is present along the arms of prometaphase chromosomes. In the absence of this motor protein, the chromosomes fail to align at the center of the spindle and instead are found stretched along spindle microtubules and clustered near the poles. Kid normally provides force for moving chromosomes away from the poles (see Figure 14.33a). (FROM CELIA ANTONIO ET AL., CELL VOL. 102, COVER #4, 2000; WITH PERMISSION FROM ELSEVIER. COURTESY OF ISABELLE VERNOS.)


Figure 14.20 Prometaphase. (a) Fluorescence micrograph of a cultured newt lung cell at the early prometaphase stage of mitosis, just after the nuclear envelope has broken. The microtubules of the mitotic spindle are now able to interact with the chromosomes. The mitotic spindle appears green after labeling with a monoclonal antibody against tubulin, whereas the chromosomes appear blue after labeling with a fluorescent dye. (b) Schematic view of some of the successive steps in chromosome-microtubule interactions during prometaphase. In step 1, a kinetochore has made contact with the sidewall of a microtubule and is capable of utilizing kinetochore-bound motors to slide in either direction along the microtubule. In step 2, a chromosome has become attached to the plus end of a microtubule from one spindle pole (an end-on attachment forming a mono-oriented chromosome). In step 3, the chromosome has become attached in an end-on orientation to microtubules from both poles (forming a bi-oriented chromosome). In step 4, the bi-oriented chromosome has been moved to the center of the cell and will become part of the metaphase plate. Chromosomes at this stage are under tension (as indicated by the space between the chromatids) due to the opposing pulling forces exerted by the microtubules from opposite poles. The chromosome in step 3a has both of its kinetochores attached to microtubules from the same spindle pole. This abnormal syntelic attachment is discussed on page 596. (A: COURTESY OF ALEXEY KHODJAKOV, WADSWORTH CENTER, NY.)

[Figure 14.22 labels: Pole; Kinetochores; Rapid addition of tubulin subunits; Rapid loss of tubulin subunits; Slow polymerization; Slow depolymerization]


Figure 14.22 Microtubule behavior during formation of the metaphase plate. Initially, the chromosome is connected to microtubules from opposite poles that may be very different in length. As prometaphase continues, this imbalance is corrected as the result of the shortening of microtubules from one pole, due to the rapid loss of tubulin subunits at the kinetochore, and the lengthening of microtubules from the opposite pole, due to the rapid addition of tubulin subunits at the kinetochore. These changes are superimposed over a much slower polymerization and depolymerization apparent in the lower drawing that occur continually during prometaphase and metaphase, causing the subunits of the microtubule to move toward the poles in a process known as microtubule flux.

Microtubule dynamics also play a key role in facilitating chromosome movements during prometaphase. As the chromosomes congress toward the center of the mitotic spindle, the longer microtubules attached to one kinetochore are shortened, while the shorter microtubules attached to the sister kinetochore are elongated. These changes in microtubule length are thought to be governed by differences in pulling force (tension) on the two sister kinetochores. Shortening and elongation of microtubules occur primarily by loss or gain of subunits at the plus end of the microtubule (Figure 14.22). Remarkably, this dynamic activity occurs while the plus end of each microtubule remains attached to a kinetochore. Eventually, each chromosome moves into position along a plane at the center of the spindle, so that microtubules from each pole are equivalent in length. The movement of a wayward chromosome from a peripheral site near one of the poles to the center of the mitotic spindle during prometaphase is shown in the series of photos in Figure 14.23.
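The length balancing described above can be pictured with a toy calculation. This is a minimal sketch under loud assumptions: the function name, the step size, and the starting lengths are all hypothetical, the pole-to-pole distance is held fixed, and the tension-sensing mechanism is reduced to "the longer fiber loses subunits at its plus end while the shorter one gains them."

```python
def congress(fiber_from_pole_1_um: float,
             fiber_from_pole_2_um: float,
             step_um: float = 0.1):
    """Shift subunits between two kinetochore fibers until their lengths match.

    At each step the longer fiber loses `step_um` at its plus end and the
    shorter fiber gains the same amount, so the summed length (the assumed
    fixed pole-to-pole distance) is conserved.
    """
    a, b = fiber_from_pole_1_um, fiber_from_pole_2_um
    steps = 0
    while abs(a - b) > step_um:
        if a > b:
            a -= step_um  # net loss at the kinetochore of the longer fiber
            b += step_um  # net gain at the kinetochore of the shorter fiber
        else:
            a += step_um
            b -= step_um
        steps += 1
    return a, b, steps


if __name__ == "__main__":
    # Hypothetical starting point: chromosome much closer to pole 2.
    a, b, steps = congress(8.0, 2.0)
    print(f"fiber lengths after {steps} steps: {a:.2f} um and {b:.2f} um")
```

The chromosome ends up midway between the poles because that is the only position at which the two fibers are equal in length.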

Metaphase

Once all of the chromosomes have become aligned at the spindle equator, with one chromatid of each chromosome connected by its kinetochore to microtubules from one pole and its sister chromatid connected by its kinetochore to microtubules from the opposite pole, the cell has reached the stage of metaphase (Figure 14.24). The plane of alignment of the chromosomes at metaphase is referred to as the metaphase plate. The mitotic spindle of the metaphase cell contains a highly organized array of microtubules that is ideally suited for the task of separating the duplicated chromatids positioned at the center of the cell. Functionally and spatially, the microtubules of the metaphase spindle of an animal cell can be divided into three groups (Figure 14.24):

1. Astral microtubules that radiate outward from the centrosome into the region outside the body of the spindle. They help position the spindle apparatus in the cell and may help determine the plane of cytokinesis.
2. Chromosomal (or kinetochore) microtubules that extend between the centrosome and the kinetochores of the chromosomes. In mammalian cells, each kinetochore is attached to a bundle of 20–30 microtubules, which forms a spindle fiber (or k-fiber). During metaphase, the chromosomal microtubules exert a pulling force on the kinetochores. As a result, the chromosomes are maintained in the equatorial plane by a “tug-of-war” between balanced pulling forces exerted by chromosomal spindle fibers from opposite poles. These pulling forces generate deformations within the kinetochore and cause oscillations of the chromosomes situated at the metaphase plate. During anaphase, chromosomal microtubules are required for the movement of the chromosomes toward the poles.
3. Polar (or interpolar) microtubules that extend from the centrosome past the chromosomes. Polar microtubules from one centrosome overlap with their counterparts from the opposite centrosome. The polar microtubules form a structural basket that maintains the mechanical integrity of the spindle.

As one watches films or videos of mitosis, metaphase appears as a stage during which the cell pauses for a brief period, as if all mitotic activities suddenly come to a halt. However, experimental analysis reveals that metaphase is a time when important events occur.

Microtubule Flux in the Metaphase Spindle

Even though there is no obvious change in length of the chromosomal microtubules as the chromosomes are aligned at the metaphase plate, studies using fluorescently labeled tubulin indicate that the microtubules exist in a highly dynamic state. Subunits are rapidly lost and added at the plus ends of the chromosomal microtubules, even though these ends are attached to the kinetochore.
Thus, the kinetochore does not act like a cap at the end of the microtubule, blocking the entry or exit of terminal subunits, but rather it is the site of dynamic activity. Because more subunits are added to the plus end than are lost, there is a net addition of subunits at the kinetochore. Meanwhile, the minus ends of the microtubules experience a net loss, and thus subunits move along the chromosomal microtubules from the kinetochore toward the pole. This poleward flux of tubulin

[Figure 14.23 panels: A–F]

Figure 14.23 The engagement of a chromosome during prometaphase and its movement to the metaphase plate. This series of photographs taken from a video recording shows the movements of the chromosomes of a newt lung cell over a period of 100 seconds during prometaphase. Although most of the cell’s chromosomes were nearly aligned at the metaphase plate at the beginning of the sequence, one of the chromosomes (arrow) had failed to become attached to spindle fibers from both poles. The wayward chromosome has become attached to spindle fibers from opposite poles in B and then moves toward the spindle equator with variable velocity until it reaches a stable position in F. The position of one pole is indicated by the arrowhead in A. (FROM STEPHEN P. ALEXANDER AND CONLY L. RIEDER, J. CELL BIOL. 113:807, 1991, FIG. 1. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS. COURTESY CONLY L. RIEDER.)

[Figure 14.24 labels: Astral spindle microtubules; Chromosomal (kinetochore) spindle fibers; Polar spindle microtubules; Centriole; Pericentriolar material; Chromosomes]

Figure 14.24 The mitotic spindle of an animal cell. Each spindle pole contains a pair of centrioles surrounded by amorphous pericentriolar material at which the microtubules are nucleated. Three types of spindle microtubules (astral, chromosomal, and polar spindle microtubules) are evident, and their functions are described in the text. All of the spindle microtubules, which can number in the thousands, have their minus ends pointed toward the centrosome. Although not shown here, spindles may also contain shorter microtubules that do not make contact with either a kinetochore or a spindle pole.

[Figure 14.25 label: Tubulin flux]

Figure 14.25 Tubulin flux through the microtubules of the mitotic spindle at metaphase. Even though the microtubules appear stationary at this stage, injection of fluorescently labeled tubulin subunits indicates that the components of the spindle are in a dynamic state of flux. Subunits are incorporated preferentially at the kinetochores of the chromosomal microtubules and the equatorial ends of the polar microtubules, and they are lost preferentially from the minus ends of the microtubules in the region of the poles. Tubulin subunits move through the microtubules of a metaphase spindle at a rate of about 1 μm/min.

subunits in a mitotic spindle is indicated in the experiment illustrated in Figure 14.25. Loss of tubulin subunits at the poles is likely aided by a member of the kinesin-13 family of motor proteins whose function is to promote microtubule depolymerization rather than movement (page 336).
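The bookkeeping behind poleward flux can be made concrete with a short calculation. This is a sketch under stated assumptions: the flux rate of about 1 μm/min is the figure given for a metaphase spindle, but the 10 μm fiber length and the function names are hypothetical values chosen only for illustration, and net gain at the plus end is assumed to balance net loss at the minus end exactly.

```python
FLUX_RATE_UM_PER_MIN = 1.0  # approximate metaphase flux rate quoted in the text


def simulate_flux(fiber_length_um: float,
                  minutes: float,
                  plus_end_gain: float = FLUX_RATE_UM_PER_MIN,
                  minus_end_loss: float = FLUX_RATE_UM_PER_MIN):
    """Return (fiber length, distance of a labeled mark from the kinetochore).

    The mark is a patch of labeled tubulin incorporated at the kinetochore
    (plus end) at time zero. Subunits added behind it displace it poleward.
    """
    # Length changes only if gain at the plus end and loss at the minus end differ.
    length = fiber_length_um + (plus_end_gain - minus_end_loss) * minutes
    # The mark cannot travel farther than the pole, where subunits are lost.
    mark = min(plus_end_gain * minutes, length)
    return length, mark


def minutes_to_reach_pole(fiber_length_um: float,
                          flux_rate: float = FLUX_RATE_UM_PER_MIN) -> float:
    """Time for a subunit added at the kinetochore to arrive at the pole."""
    return fiber_length_um / flux_rate


if __name__ == "__main__":
    length, mark = simulate_flux(10.0, 4.0)  # hypothetical 10 um fiber
    print(f"length {length} um, mark {mark} um from kinetochore")
    print(f"transit time {minutes_to_reach_pole(10.0)} min")
```

With equal rates at the two ends the fiber length does not change, which is why the spindle looks static even though every subunit is in transit toward the pole.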

Anaphase


Anaphase begins when the sister chromatids of each chromosome split apart and start their movement toward opposite poles.

The Role of Proteolysis in Progression Through Mitosis

We have seen how important it is that specific activities take place in their proper order throughout the cell cycle. This orderliness depends to a large degree on the selective destruction of cell cycle regulatory proteins at precise times during the cell cycle. It was pointed out earlier that two distinct multiprotein complexes, SCF and APC, add ubiquitin to proteins at different stages of the cell cycle, targeting them for destruction by a proteasome. The periods during the cell cycle in which the SCF and APC complexes are active are shown in Figure 14.26a. As illustrated in Figure 14.26a, SCF acts primarily during interphase. In contrast, the anaphase promoting complex, or APC, plays a key role in regulating events that occur during mitosis. The APC contains about a dozen core subunits, in addition to an “adaptor protein” that plays a key role in determining which proteins serve as the APC substrate. Two alternate versions of this adaptor protein, Cdc20 and Cdh1, determine substrate selection during mitosis. APC complexes containing one or the other of these adaptors are known as APCCdc20 or APCCdh1 (Figure 14.26a). APCCdc20 becomes activated prior to metaphase (Figure 14.26a) and ubiquitinates a key anaphase inhibitor called securin, so named because it secures the attachment between sister chromatids. The ubiquitination and destruction of securin at the end of metaphase release an active protease called separase. Separase then cleaves the Scc1 subunit of the cohesin molecule that holds sister chromatids together (Figure 14.26b). Cleavage of cohesin triggers the separation of sister chromatids to mark the onset of anaphase. Experimental support for the role of cohesin in maintaining the attachment of sister chromatids comes from studies in which proteolytic enzymes have been injected into cells that had been arrested in metaphase. Cleavage of cohesin by such enzymes leads rapidly to the separation of chromatids and their anaphase-like movement towards the poles. Near the end of mitosis, Cdc20 is inactivated, and the alternate adaptor, Cdh1, takes control of the APC’s substrate selection (Figure 14.26a). When Cdh1 is associated with the APC, the enzyme completes the ubiquitination of cyclin B that was begun by APCCdc20. Destruction of the cyclin leads to a precipitous drop in activity of the mitotic Cdk (cyclin B–Cdk1) and progression of the cell out of mitosis and into the G1 phase of the next cell cycle. The importance of protein degradation in regulating the events of mitosis and the reentry of cells into G1 is best revealed with the use of inhibitors (Figure 14.27).
If the destruction of cyclin B is prevented with an inhibitor of the proteasome, cells remain arrested in a late stage of mitosis. If such cells that are arrested in mitosis are subsequently treated with a compound that inhibits the kinase activity of Cdk1, the cell will return to its normal activities and continue through mitosis and cytokinesis. The completion of mitosis clearly requires the cessation of activity of Cdk1 (either by the normal destruction of its cyclin activator or by experimental inhibition). Perhaps the most striking finding of all is obtained when the proteasome-inhibited and Cdk1-inhibited cells that have already exited mitosis are washed free of the Cdk1 inhibitor (Figure 14.27). Such cells, which now contain both cyclin B and active Cdk1, actually progress in the reverse direction and head back into mitosis. This reversal is characterized by compaction of the chromosomes, breakdown of the nuclear envelope, assembly of a mitotic spindle, and movement of the chromosomes back to the metaphase plate, as shown in Figure 14.27. All of these events are triggered by the inappropriate reactivation of Cdk1 activity by removal of its inhibitor. This finding dramatically illustrates the importance of proteolysis in moving the normal cell cycle in a single, irreversible direction.

The Events of Anaphase

All the chromosomes of the metaphase plate are split in synchrony at the onset of anaphase, and the chromatids (now referred to as chromosomes,

[Figure 14.26 labels: (a) SCF/APC activities; Cell cycle phase (G1, S, G2, M: Pro, Meta, Ana, Telo); SCF; APCCdc20; Cdh1; SAC; Time. (b) Chromosome status (Metaphase, Anaphase, G1); APC; Cdc20; Cdh1; Securin; Cohesin; Mitotic cyclins]

Figure 14.26 SCF and APC activities during the cell cycle. SCF and APC are multisubunit complexes that ubiquitinate substrates, leading to their destruction by proteasomes. (a) SCF is active primarily during interphase, whereas APC (anaphase promoting complex) is active during mitosis and G1. Two different versions of APC are indicated. These two APCs differ in containing either a Cdc20 or a Cdh1 adaptor protein, which alters the substrates recognized by the APC. APCCdc20 is active early in mitosis, at a time when Cdh1 is inhibited by Cdk1-mediated phosphorylation. As Cdk1 activity drops sharply in late mitosis, Cdh1 is activated, leading to the activation of APCCdh1. The label SAC stands for spindle assembly checkpoint, which is discussed on page 596. The SAC prevents APCCdc20 from triggering anaphase until all the chromosomes are properly aligned at the metaphase plate. (b) APCCdc20 is responsible for destroying proteins, such as securin, that inhibit anaphase. Destruction of these substrates promotes the metaphase–anaphase transition. APCCdh1 is responsible for ubiquitinating proteins, such as mitotic cyclins, that inhibit exit from mitosis. Destruction of these substrates promotes the mitosis–G1 transition. APCCdh1 activity during early G1 helps maintain the low cyclin–Cdk activity (Figure 14.8) required to assemble prereplication complexes at the origins of replication (Figure 13.20). (Although not discussed here, the activation of phosphatases that remove the phosphate groups added by Cdk1 also plays a substantial part in driving the events that occur during the latter stages of mitosis and reentry into G1; see Nature Revs. Mol. Cell Biol. 12:469, 2011.) (A: AFTER J-M PETERS, CURR. OPIN. CELL BIOL. 10:762, 1998; COPYRIGHT 1998. CURRENT OPINION IN CELL BIOLOGY BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER. SEE ALSO NATURE REVS. MOL. CELL BIOL. 7:650, 2006.)

Figure 14.27 Experimental demonstration of the importance of proteolysis in a cell’s irreversible exit from mitosis. This illustration shows frames from a video of a cell that had been arrested in mitosis by the presence of a proteasome inhibitor (MG132). At time 0, a Cdk1 inhibitor (flavopiridol) was added to the medium, which caused the cell to complete mitosis and initiate cytokinesis. At 25 min, the cell was washed free of the Cdk1 inhibitor. Because the cell still contained cyclin B (which would normally have been destroyed by proteasomes), the cell reentered mitosis and progressed to metaphase as seen in the last five frames of the video. The upper row shows phase-contrast images of the cell at various times, and the lower row shows the corresponding fluorescence micrographs with times indicated. Video 3, from which these frames were taken, can be seen at the online version of this paper. Bar in lower right image equals 10 μm. (FROM TAMARA A. POTAPOVA, ET AL., NATURE 440:954, 2006. © 2006, REPRINTED WITH PERMISSION FROM MACMILLAN PUBLISHERS LIMITED. COURTESY OF GARY J. GORBSKY.)

because they are no longer attached to their sisters) begin their poleward migration (see Figure 14.11). As each chromosome moves during anaphase, its centromere is seen at its leading edge with the arms of the chromosome trailing behind (Figure 14.28a). The movement of chromosomes toward opposite poles is very slow relative to other types of cellular movements,

proceeding at approximately 1 μm per minute, a value calculated by one mitosis researcher to be equivalent to a trip from North Carolina to Italy that would take approximately 14 million years. The slow rate of chromosome movement ensures that the chromosomes segregate accurately and without entanglement. The forces thought to power chromosome movement during anaphase are discussed in the following section. The poleward movement of chromosomes is accompanied by obvious shortening of chromosomal microtubules. It has long been appreciated that tubulin subunits are lost from the plus (kinetochore-based) ends of chromosomal microtubules during anaphase (Figure 14.28b). Subunits are also lost from the minus ends of these microtubules as a result of the continued poleward flux of tubulin subunits that occurs during prometaphase and metaphase (Figures 14.22 and 14.25). The primary difference in microtubule dynamics between metaphase and anaphase is that subunits are added to the plus ends of microtubules during metaphase, keeping the length of the chromosomal fibers constant (Figure 14.25), whereas subunits are lost from the plus ends during anaphase, resulting in shortening of the chromosomal fibers (Figure 14.28b). This change in behavior at the microtubule plus ends is thought to be triggered by the loss of tension on the kinetochores following separation of the sister chromatids. The movement of the chromosomes toward the poles is referred to as anaphase A to distinguish it from a separate but

[Figure 14.28 labels: Anaphase begins; Anaphase continues]

Figure 14.28 The mitotic spindle and chromosomes at anaphase. (a) Fluorescence micrograph of a cell at late anaphase, showing the highly compacted arms of the chromosomes at this stage. The arms are seen to trail behind as the kinetochores lead the way toward the respective poles. The chromosomal spindle fibers at this late anaphase stage are extremely short and are no longer evident between the forward edges of the chromosomes and the poles. The polar spindle microtubules, however, are clearly visible in the interzone between the separating chromosomes. Relative movements of the polar microtubules are thought to be responsible for causing the separation of the poles that occurs during anaphase B. (b) Microtubule dynamics during anaphase. Tubulin subunits are lost from both ends of the chromosomal microtubules, resulting in shortening of chromosomal fibers and movement of the chromosomes toward the poles during anaphase A. Meanwhile, tubulin subunits are added to the plus ends of polar microtubules, which also slide across one another, leading to separation of the poles during anaphase B. (A: FROM FELIPE MORA-BERMÚDEZ, DANIEL GERLICH, AND JAN ELLENBERG, COVER OF NATURE CELL BIOL. 9, #7, 2007; © 2007, REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

simultaneous movement, called anaphase B, in which the two spindle poles move farther apart. The elongation of the mitotic spindle during anaphase B is accompanied by the net addition of tubulin subunits to the plus ends of the polar microtubules. Thus, subunits can be preferentially added to polar microtubules and removed from chromosomal microtubules at the same time in different regions of the same mitotic spindle (Figure 14.28b).

Forces Required for Chromosome Movements at Anaphase

In the early 1960s, Shinya Inoué of the Marine Biological Laboratory at Woods Hole proposed that the depolymerization of chromosomal microtubules during anaphase was not simply a consequence of chromosome movement but the cause of it. Inoué suggested that depolymerization of the microtubules that comprise a spindle fiber could generate sufficient mechanical force to pull a chromosome forward. Early experimental support for the disassembly-force model came from studies in which chromosomes underwent considerable movement as the result of the depolymerization of attached microtubules. An example of one of these experiments is shown in Figure 14.29. In this case, the movement of a microtubule-bound chromosome (arrow) occurs in vitro following dilution of the medium. Dilution reduces the concentration of soluble tubulin, which in turn promotes the depolymerization of the microtubules. These types of experiments, as well as direct force-measuring studies, indicate that microtubule depolymerization alone can generate sufficient force to pull chromosomes through a cell.

Figure 14.29 Experimental demonstration that microtubule depolymerization can move attached chromosomes in vitro. The structure at the lower left is the remnant of a lysed protozoan. In the presence of tubulin, the basal bodies at the surface of the protozoan were used as sites for the initiation of microtubules, which grew outward into the medium. Once the microtubules had formed, condensed mitotic chromosomes were introduced into the chamber and allowed to bind to the ends of the microtubules. The arrow shows a chromosome attached to the end of a bundle of microtubules. The concentration of soluble tubulin within the chamber was then decreased by dilution, causing the depolymerization of the microtubules. As shown in this video sequence, the shrinkage of the microtubules was accompanied by the movement of the attached chromosome. Bar equals 5 μm. (FROM MARTINE COUE, VIVIAN A. LOMBILLO, AND J. RICHARD MCINTOSH, J. CELL BIOL. 112:1169, 1991, FIG. 3. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

Figure 14.30a depicts a model of the events proposed to occur during chromosome movement at anaphase in a vertebrate cell. As indicated in Figure 14.28b, the microtubules that comprise the chromosomal spindle fibers undergo depolymerization at both their minus and plus ends during anaphase. These combined activities lead to the movement of chromosomes toward the pole. Depolymerization at the microtubule minus ends serves to transport the chromosomes toward the poles due to poleward flux, reminiscent of a person standing on a “moving walkway” in an airport. In contrast, depolymerization at the microtubule plus ends serves to “chew up” the fiber that is towing the chromosomes. Some cells rely more on poleward flux, others more on plus-end depolymerization. Studies of animal cells in anaphase have revealed that both the plus and minus ends of chromosomal fibers are sites where depolymerizing kinesins (members of the kinesin-13 family, page 336) are localized. These depolymerases are indicated at the opposite ends of the microtubule in Figure 14.30a. If either of these microtubule “depolymerases” is specifically

Figure 14.30 Proposed mechanism for the movement of chromosomes during anaphase in animal cells. In the model depicted here, chromosome movement toward the poles is accomplished by a combination of poleward flux, which moves the body of each microtubule toward one of the poles, and simultaneous depolymerization of the microtubule at both ends. Depolymerizing kinesins of the kinesin-13 family have been localized at both the plus (kinetochore) and minus (polar) ends of chromosomal microtubules and are postulated to be responsible for depolymerization at their respective sites. In this model, the Ndc80 protein complex of the outer kinetochore plate acts as the device that couples microtubule depolymerization to chromosome segregation. The force required for chromosome movement is provided by the release of strain energy as the microtubule depolymerizes. The released energy is utilized by the curled ends of the depolymerizing protofilaments to bias the movement of the bound heads of the Ndc80 complex (dashed black arrow) toward the minus end of the microtubule. Motor proteins of the kinetochore, such as cytoplasmic dynein, may also have a force-generating role in chromosome movement during anaphase.

14.2 M Phase: Mitosis and Cytokinesis


Chapter 14 Cellular Reproduction

inhibited, chromosome segregation during anaphase is at least partially disrupted. These findings suggest that ATP-dependent, kinesin-mediated depolymerization forms the basis for chromosome segregation during mitosis. It was mentioned on page 585 that one of the questions of greatest interest in the field of mitosis is the mechanism by which kinetochores are able to hold on to the plus ends of microtubules that are losing tubulin subunits. The Ndc80 complexes of the kinetochore are present as molecular fibrils that reach out to form relatively weak linkages with an attached microtubule just behind its plus end. It is estimated that each microtubule is contacted around its circumference by 6 to 9 of these Ndc80 tethers. A number of studies suggest that, as indicated in Figure 14.30, the terminal heads of the Ndc80 complexes are able to travel along the microtubule towards its minus end, pushed along by the curling protofilaments of the disassembling tip. As a result, the attached chromosome moves toward the spindle pole as it is towed by the shrinking chromosomal fiber. The Spindle Assembly Checkpoint As discussed on page 579, cells possess checkpoint mechanisms that monitor the status of events during the cell cycle. One of these checkpoints operates at the transition between metaphase and anaphase. The spindle assembly checkpoint (SAC), as it is called, is best revealed when a chromosome fails to become aligned properly at the metaphase plate. When this happens, the checkpoint mechanism delays the onset of anaphase until the misplaced chromosome has assumed its proper position along the spindle equator. If a cell were not able to postpone chromosome segregation, it would greatly elevate the risk of the daughter cells receiving an abnormal number of chromosomes (aneuploidy). This expectation has been confirmed with the identification of a number of children with inherited deficiencies in one of the spindle checkpoint proteins. 
These individuals exhibit a disorder (named MVA), which is characterized by a high percentage of aneuploid cells and a greatly increased risk of developing cancer. How does the cell determine whether or not all of the chromosomes are properly aligned at the metaphase plate? Let’s consider a chromosome that is only attached to microtubules from one spindle pole, which is probably the circumstance of the wayward chromosome indicated by the arrow in Figure 14.23a. Unattached kinetochores contain a complex of proteins, the best studied of which is called Mad2, that mediates the spindle assembly checkpoint. The presence of these proteins at an unattached kinetochore sends a “wait” signal to the cell cycle machinery that prevents the cell from continuing on into anaphase. Once the wayward chromosome becomes attached to spindle fibers from both spindle poles and becomes properly aligned at the metaphase plate, the signaling complex leaves the kinetochore, which turns off the “wait” signal and allows the cell to progress into anaphase. Figure 14.31 shows the mitotic spindle of a cell that is arrested prior to metaphase due to a single unaligned chromosome. Unlike all of the other kinetochores in this cell, only the unaligned chromosome is seen to still contain the Mad2 protein. As long as the cell contains unaligned chromosomes,

Figure 14.31 The spindle assembly checkpoint. Fluorescence micrograph of a mammalian cell in late prometaphase labeled with antibodies against the spindle checkpoint protein Mad2 (pink) and tubulin of the microtubules (green). The chromosomes appear blue. Only one of the chromosomes of this cell is seen to contain Mad2, and this chromosome has not yet become aligned at the metaphase plate. The presence of Mad2 on the kinetochore of this chromosome is sufficient to prevent the initiation of anaphase. (FROM JENNIFER WATERS, REY-HUEI CHEN, ANDREW W. MURRAY, AND E. D. SALMON, J. CELL BIOL. 141, COVER #5, 1998; REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

Mad2 molecules are able to inhibit cell cycle progress. According to a favored model, inhibition is achieved through direct interaction between Mad2 and the APC activator Cdc20. During the period that Cdc20 is bound to Mad2, APC complexes would be unable to ubiquitinate the anaphase inhibitor securin, thus keeping all of the sister chromatids attached to one another by their cohesin “glue.” It is well established that the spindle assembly checkpoint is activated by the presence of an unattached kinetochore, but there are other chromosomal abnormalities that arise during the progression to metaphase that also require corrective measures. For example, on occasion, the two kinetochores of sister chromatids will become attached to microtubules from the same spindle pole, a condition referred to as a syntelic attachment (step 3a, Figure 14.20b). If not corrected, a syntelic attachment is very likely to lead to the movement of both sister chromatids to one of the daughter cells, leaving the other daughter devoid of this chromosome. The cell is likely alerted to the presence of a chromosome with a syntelic attachment by the lack of tension on the chromosome’s kinetochores. Tension normally develops when sister chromatids are being pulled by microtubules from opposite spindle poles (steps 3–4, Figure 14.20b). Cells are able to correct syntelic attachments (and other types of abnormal microtubule connections) through the action of an enzyme called Aurora B kinase, which is part of a mobile protein complex that resides at the centromere during prometaphase and metaphase. Among the substrates of Aurora B kinase are several of the proteins thought to be involved in kinetochore–microtubule attachment, including members of the Ndc80 complex and the kinesin depolymerase of Figure


14.30. Studies suggest that Aurora B kinase molecules of an incorrectly attached chromosome phosphorylate these protein substrates, which destabilizes microtubule attachment to both kinetochores. Once freed of their bonds, the kinetochores of each sister chromatid have a fresh opportunity to become attached to microtubules from opposite spindle poles. Inhibition of Aurora B kinase in cells or extracts leads to misalignment and missegregation of chromosomes (see Figure 18.51).
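The logic of the spindle assembly checkpoint described above can be sketched as a small toy model in Python. This is only an illustration under simplifying assumptions: each kinetochore is reduced to a single attached/unattached flag, Mad2 is treated as present on every unattached kinetochore, and one Mad2-positive kinetochore is taken to be enough to keep Cdc20 unavailable to the APC, as in the favored model described in the text. The class and function names are hypothetical and chosen only for this example, and the tension-sensing role of Aurora B kinase is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Kinetochore:
    """Toy kinetochore: records only whether spindle microtubules are attached."""
    attached: bool


def mad2_positive(kinetochore: Kinetochore) -> bool:
    # Assumption: Mad2 sits on unattached kinetochores and leaves on attachment.
    return not kinetochore.attached


def anaphase_permitted(kinetochores: list[Kinetochore]) -> bool:
    # A single Mad2-positive kinetochore is enough to send the "wait" signal.
    wait_signal = any(mad2_positive(k) for k in kinetochores)
    cdc20_free = not wait_signal        # Mad2 binds the APC activator Cdc20
    apc_active = cdc20_free             # the APC needs free Cdc20 to act
    securin_ubiquitinated = apc_active  # active APC ubiquitinates securin
    return securin_ubiquitinated        # cohesin "glue" can then be released


aligned = [Kinetochore(attached=True) for _ in range(8)]
one_unattached = aligned[:-1] + [Kinetochore(attached=False)]

print(anaphase_permitted(aligned))         # True
print(anaphase_permitted(one_unattached))  # False
```

In this sketch a syntelic attachment would pass the check, because both kinetochores count as attached; this is why the text treats the tension-dependent correction by Aurora B kinase as a separate mechanism.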

Telophase As the chromosomes near their respective poles, they tend to collect in a mass, which marks the beginning of the final stage of mitosis, or telophase (Figures 14.11 and 14.32). During telophase, daughter cells return to the interphase condition: the mitotic spindle disassembles, the nuclear envelope reforms, and the chromosomes become more and more dispersed until they disappear from view under the microscope. The actual partitioning of the cytoplasm into two daughter cells occurs by a process to be discussed shortly. First, however,

let’s look back and consider the motor proteins involved in several of the major chromosome movements that take place during mitosis.

Motor Proteins Required for Mitotic Movements

Mitosis is characterized by extensive movements of cellular structures. Prophase is accompanied by movement of the spindle poles to opposite ends of the cell, prometaphase by movement of the chromosomes to the spindle equator, anaphase A by movement of the chromosomes from the spindle equator to its poles, and anaphase B by the elongation of the spindle. Over the past decade, a variety of different molecular motors have been identified in different locations in mitotic cells of widely diverse species. The motors involved in mitotic movements are primarily microtubule motors, including a number of different kinesin-related proteins and cytoplasmic dynein. Some of the motors move toward the plus end of the microtubule, others toward the minus end. As discussed above, one group of kinesins does not move anywhere, but promotes microtubule depolymerization. Motor proteins have been localized at the spindle poles, along the spindle fibers, and within both the kinetochores and arms of the chromosomes. Although firm conclusions about the functions of specific motor proteins cannot yet be drawn, a general picture of the roles of these molecules is suggested (Figure 14.33):

■ Motor proteins located along the polar microtubules probably contribute by keeping the poles apart (Figure 14.33a,b).
■ Motor proteins residing on the chromosomes are probably important in the movements of the chromosomes during prometaphase (Figure 14.33a), in maintaining the chromosomes at the metaphase plate (Figure 14.33b), and in separating the chromosomes during anaphase (Figure 14.33c).
■ Motor proteins situated along the overlapping polar microtubules in the region of the spindle equator are probably responsible for cross-linking antiparallel microtubules and sliding them over one another, thus elongating the spindle during anaphase B (Figure 14.33c).
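The anaphase A model discussed earlier in this section (Figure 14.30) lends itself to a little bookkeeping, sketched below in Python. The sketch assumes that chromosome speed toward the pole is simply the sum of the poleward flux rate and the plus-end depolymerization rate, and that the change in length of a chromosomal fiber is plus-end growth minus minus-end loss. The individual rates (0.5 μm per minute each) and the 7,500 km used for the North Carolina to Italy comparison are illustrative assumptions, not values from the text; only the overall rate of about 1 μm per minute and the figure of roughly 14 million years come from the text. All function names are hypothetical.

```python
MINUTES_PER_YEAR = 60 * 24 * 365.25
UM_PER_KM = 1e9  # 1 km = 1e9 micrometers


def poleward_speed(flux_rate: float, plus_end_loss_rate: float) -> float:
    """Chromosome speed toward the pole (um/min) as the sum of both contributions."""
    return flux_rate + plus_end_loss_rate


def fiber_length_change(plus_end_rate: float, minus_end_loss_rate: float,
                        minutes: float) -> float:
    """Net change in fiber length (um). plus_end_rate > 0 is growth, < 0 is loss."""
    return (plus_end_rate - minus_end_loss_rate) * minutes


def years_to_cover(distance_km: float, speed_um_per_min: float) -> float:
    """Time in years to cover a distance at a speed given in um per minute."""
    return distance_km * UM_PER_KM / speed_um_per_min / MINUTES_PER_YEAR


# Metaphase: plus-end addition balances minus-end loss, so length stays constant.
print(fiber_length_change(plus_end_rate=0.5, minus_end_loss_rate=0.5, minutes=5))   # 0.0

# Anaphase: subunits are lost at both ends, so the fiber shortens.
print(fiber_length_change(plus_end_rate=-0.5, minus_end_loss_rate=0.5, minutes=5))  # -5.0

# Assumed equal contributions of flux and plus-end depolymerization.
speed = poleward_speed(flux_rate=0.5, plus_end_loss_rate=0.5)
print(speed)  # 1.0

# Assumed 7,500 km trip at chromosome speed, expressed in millions of years.
print(years_to_cover(7500, speed) / 1e6)
```

With the assumed distance, the last line gives a value a little over 14 million years, consistent with the comparison quoted in the text. Changing the split between flux and plus-end loss leaves the total speed unchanged, which mirrors the statement that some cells rely more on one mechanism than the other.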

Cytokinesis

Figure 14.32 Telophase. Electron micrograph of a section through a human lymphocyte in telophase. (DAVID M. PHILLIPS / PHOTO RESEARCHERS, INC.)

Mitosis accomplishes the segregation of duplicated chromosomes into daughter nuclei, but the cell is divided into two daughter cells by a separate process called cytokinesis. The first hint of cytokinesis in most animal cells appears during anaphase as an indentation of the cell surface in a narrow band around the cell. As time progresses, the indentation deepens to form a furrow that moves inward toward the center of the cell. The plane of the furrow lies in the same plane previously occupied by the chromosomes of the metaphase plate, so that the two sets of chromosomes are ultimately partitioned into different cells (as in Figure 14.32). As one cell becomes two cells, additional plasma membrane is delivered to the cell surface via cytoplasmic vesicles that fuse with the advancing

cleavage furrow. In its latter stages, the advancing furrow passes through the tightly packed remnants of the central portion of the mitotic spindle, which forms a cytoplasmic bridge between the daughter cells called the midbody (Figure 14.34a). The final step of cytokinesis is called abscission, when the surfaces of the cleavage furrow fuse with one another, splitting the cell in two (Figure 14.34b). Abscission requires the action of ESCRT complexes, the same proteins responsible for severing the intraluminal vesicles that form within endosomes (page 312).

Figure 14.33 Proposed activity of motor proteins during mitosis. (a) Prometaphase. The two halves of the mitotic spindle are moving apart from one another to opposite poles, due to the action of bipolar (4-headed) plus-end-directed motors (members of the kinesin-5 family). These motors can bind by their heads to antiparallel microtubules from opposite poles and cause them to slide apart (step 1). (Additional motors associated with the centrosomes and cortex are not shown.) Meanwhile, the chromosomes have become attached to the chromosomal microtubules and can be seen oscillating back and forth along the microtubules. Ultimately, the chromosomes are moved to the center of the spindle, midway between the poles. Poleward chromosome movements are mediated by minus-end-directed motors (i.e., cytoplasmic dynein) residing at the kinetochore (step 2). Chromosome movements away from the poles are mediated by plus-end-directed motors (i.e., kinesins) residing at the kinetochore and especially along the chromosome arms (step 3) (see Figure 14.21). (b) Metaphase. The two halves of the spindle maintain their separation as the result of continued plus-end-directed motor activity associated with the polar microtubules (step 4). The motor activity associated with step 4 is also suspected of promoting the flux of subunits within these microtubules that is depicted in Figure 14.25. The chromosomes are thought to be maintained at the equatorial plane by the balanced activity of plus end- and minus end-directed motor proteins residing at the kinetochore (step 5). (c) Anaphase. The movement of the chromosomes toward the poles is thought to require the activity of kinesin depolymerases that catalyze depolymerization at both the plus and minus ends of microtubules (step 6). The separation of the poles (anaphase B) is thought to result from the continuing activity of the bipolar plus-end-directed motors of the polar microtubules (step 7). (K. E. SAWIN AND J. M. SCHOLEY, TRENDS CELL BIOL. 1:122, 1991. TRENDS IN CELL BIOLOGY BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Figure 14.34 Cytokinesis. (a) These cultured mammalian cells are undergoing the final step in cytokinesis, called abscission, in which the cleavage furrow cuts through the midbody, a thin cytoplasmic bridge that is packed with remnants of the central portion of the mitotic spindle. Microtubules are green, actin is red, and DNA is blue. (b) This fertilized sea urchin egg has just been split into two cells by cytokinesis. (A: FROM AHNA R. SKOP ET AL., SCIENCE 305:61, 2004, FIG. 1A. REPRINTED WITH PERMISSION FROM AAAS. B: COURTESY OF TRYGGVE GUSTAFSON.)

Our present concept of the mechanism responsible for cytokinesis stems from a proposal made by Douglas Marsland in the 1950s known as the contractile ring theory (Figure 14.35a). Marsland proposed that the force required to cleave a cell is generated in a thin band of contractile cytoplasm located in the cortex, just beneath the plasma membrane of the furrow. Microscopic examination of the cortex beneath the furrow of a cleaving cell reveals the presence of large numbers of actin filaments (Figure 14.35b and 14.36a). These unbranched actin filaments are assembled in the cell cortex by the action of a formin protein (page 372), which may also anchor the filaments to the overlying plasma membrane. Interspersed among the actin filaments are short, bipolar myosin filaments. These filaments are composed of myosin II, as evidenced by their binding of anti-myosin II antibodies (Figure 14.36b). The importance of myosin II in cytokinesis is evident from the fact that (1) anti-myosin II antibodies cause the rapid cessation of cytokinesis when injected into a dividing cell (Figure 14.36c), and (2) cells lacking a functional myosin II gene carry out nuclear division by mitosis but cannot divide normally into daughter cells.

The assembly of the actin-myosin contractile machinery in the plane of the future cleavage furrow is orchestrated by a G protein called RhoA. In its GTP-bound state, RhoA triggers a cascade of events that leads to both the assembly of actin filaments and the activation of myosin II’s motor activity. If RhoA is depleted from cells or inactivated, a cleavage furrow fails to develop.

The force-generating mechanism operating during cytokinesis is thought to be similar to the actin- and myosin-based contraction of muscle cells. In fact, the cytokinesis furrow of a primitive single-celled eukaryote is the likely evolutionary ancestor of the contractile machinery of animal muscle cells. Whereas the sliding of actin filaments of a muscle cell brings about the shortening of the muscle fiber, sliding of the filaments of the contractile ring pulls the cortex and attached plasma membrane toward the center of the cell. As a result, the contractile ring constricts the equatorial region of the cell, much like pulling on a purse string narrows the opening of a purse.

It is generally agreed that the position of the cleavage furrow is determined by the position of the anaphase mitotic spindle, which induces the activation of RhoA in a narrow ring within the cortex. However, there has been considerable

Figure 14.35 The formation and operation of the contractile ring during cytokinesis. (a) Actin filaments become assembled in a ring at the cell equator. Contraction of the ring, which requires the action of myosin, causes the formation of a furrow that splits the cell in two. (b) Confocal fluorescence micrograph of a fly spermatocyte undergoing cytokinesis at the end of the first meiotic division. Actin filaments, which have been stained with the mushroom toxin phalloidin, are seen to be concentrated in a circular equatorial band within the cleavage furrow. (B: FROM DANIEL SAUL ET AL., J. CELL SCIENCE 117:3893, 2004, #17 COVER, COURTESY OF JULIE A. BRILL, BY PERMISSION OF THE COMPANY OF BIOLOGISTS LTD. http://jcs.biologists.org/content/117/17.cover-expansion)


Figure 14.36 Experimental demonstration of the importance of myosin in cytokinesis. (a,b) Localization of actin and myosin II in a Dictyostelium ameba during cytokinesis as demonstrated by double-stain immunofluorescence. (a) Actin filaments (red) are located at both the cleavage furrow and the cell periphery where they play a key role in cell movement (Section 9.7). (b) Myosin II (green) is localized at the cleavage furrow, where it is part of a contractile ring that encompasses the equator. (c) A starfish egg that had been microinjected with an antibody against starfish myosin, as observed under polarized light (which causes the mitotic spindles to appear either brighter or darker than the background due to the presence of oriented microtubules). While cytokinesis has been completely suppressed by the antibodies, mitosis (as revealed by the mitotic spindles) continues unaffected. (A,B: COURTESY OF YOSHIO FUKUI; C: FROM DANIEL P. KIEHART, ISSEI MABUCHI, AND SHINYA INOUÉ, J. CELL BIOL. 94:167, 1982, FIG. 1. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

debate as to how this particular zone within the cortex is selected. Early pioneering studies on marine invertebrate eggs by Ray Rappaport of Union College in New York demonstrated that the contractile ring forms in a plane midway between the spindle poles, even if one of the poles is displaced by a microneedle inserted into the cell. An example of the relationship between the position of the spindle poles and cleavage plane is shown in the experiment of Figure 14.37. These studies suggest that the site of actin-filament assembly, and

Figure 14.37 The site of formation of the cleavage plane and the time at which cleavage occurs depend on the position of the mitotic spindle. (a) This echinoderm egg was allowed to divide once to form a two-cell embryo. Then, once the mitotic spindle appeared in each of the two cells, one of the cells was drawn into a micropipette, causing it to assume a cylindrical shape. The two dark spots in each cell are the spindle poles that have formed prior to the second division of each cell. (b) Nine minutes later the cylindrical cell has completed cleavage, while the spherical cell has not yet begun to cleave. These photos indicate that (1) the cleavage plane forms between the spindle poles, regardless of their position, and (2) cleavage occurs more rapidly in the cylindrical cell. Bar equals 80 μm. (c) These results can be explained by assuming that (1) the cleavage plane (brown bar) forms where the astral microtubules overlap, and (2) cleavage occurs earlier in the cylindrical cell because the distance from the poles (blue spheres) to the site of cleavage is reduced, thereby shortening the time it takes the cleavage signal to reach the surface. (A,B: FROM CHARLES B. SHUSTER AND DAVID R. BURGESS, J. CELL BIOL. 146:987, 1999, FIG. 5. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

thus the plane of cytokinesis, is determined by a signal emanating from the spindle poles. The signal is thought to travel from the spindle poles to the cell cortex along the astral


microtubules. When the distance between the poles and the cortex is modified experimentally, the timing of cytokinesis can be dramatically altered (Figure 14.37). In contrast, researchers conducting studies on smaller mammalian cells have found evidence that the site of cleavage furrow formation is defined by a signal that originates in the central part of the mitotic spindle rather than its poles. Researchers have struggled to reconcile these opposing findings. The challenge is made greater by the fact that, unlike mitosis, cytokinesis has not been reconstituted in egg extracts, where it would be easiest to study. The simplest explanations are that (1) different cell types utilize different mechanisms or (2) both mechanisms operate in the same cell. In fact, recent studies provide evidence for this latter possibility. Cytokinesis in Plant Cells: Formation of the Cell Plate Plant cells, which are enclosed by a relatively inextensible cell wall, undergo cytokinesis by a very different mechanism. Unlike animal cells, which are constricted by a furrow that advances inward from the outer cell surface, plant cells must construct an extracellular wall inside a living cell. Wall formation starts in the center of the cell and grows outward to meet the existing lateral walls. The formation of a new cell wall begins with the construction of a simpler precursor, which is called the cell plate.

The plane in which the cell plate forms is perpendicular to the axis of the mitotic spindle but, unlike the case for animal cells, the plane is not determined by the position of the spindle nor is it determined late in mitosis. Rather, the orientation of both the mitotic spindle and cell plate are determined by a belt of cortical microtubules—the preprophase band—that forms in late G2 (see Figure 9.21). Even though the preprophase band has disassembled by prometaphase, it leaves an invisible imprint that determines the future division site. The first sign of cell plate formation is seen in late anaphase with the appearance of the phragmoplast in the center of the dividing cell. The phragmoplast consists of clusters of interdigitating microtubules oriented perpendicular to the future plate (Figure 9.21), together with actin filaments, membranous vesicles, and electron-dense material. The microtubules of the phragmoplast, which arise from remnants of the mitotic spindle, serve as tracks for the movement of small Golgi-derived secretory vesicles into the region. The vesicles become aligned along a plane between the daughter nuclei (Figure 14.38a). Electron micrographs of rapidly frozen tobacco cells have revealed the steps by which the Golgi-derived vesicles become reorganized into the cell plate. To begin the process (step 1, Figure 14.38b), the vesicles send out fingerlike tubules that contact and fuse with neighboring vesicles to

Figure 14.38 The formation of a cell plate between two daughter plant nuclei during cytokinesis. (a) A low-magnification electron micrograph showing the formation of the cell plate between the future daughter cells. Secretory vesicles derived from nearby Golgi complexes have become aligned along the equatorial plane (arrow) and are beginning to fuse with one another. The membrane of the vesicles forms the plasma membranes of the two daughter cells, and the contents of the vesicles will provide the material that forms the cell plate separating the cells. (b) Steps in the formation of the cell plate as described in the text. (A: DAVID PHILLIPS/PHOTO RESEARCHERS, INC.; B: A. L. SAMUELS, T. H. GIDDINGS, JR. & L. A. STAEHELIN, JOURNAL CELL BIOLOGY 130:1354, 1995. THE JOURNAL OF CELL BIOLOGY BY ROCKEFELLER INSTITUTE; AMERICAN SOCIETY FOR CELL BIOLOGY COPYRIGHT 1995 REPRODUCED WITH PERMISSION OF ROCKEFELLER UNIVERSITY PRESS IN THE FORMAT REPUBLISH IN A TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

form an interwoven tubular network in the center of the cell (step 2). Additional vesicles are then directed along microtubules to the lateral edges of the network. The newly arrived vesicles continue the process of tubule formation and fusion, which extends the network in an outward direction (step 2). Eventually, the leading edge of the growing network contacts the parent plasma membrane at the boundary of the cell (step 3). Ultimately, the tubular network loses its cytoplasmic gaps and matures into a continuous, flattened partition. The membranes of the tubular network become the plasma membranes of the two adjacent daughter cells, whereas the secretory products that had been carried within the vesicles contribute to the intervening cell plate. Once the cell plate is completed, cellulose and other materials are added to produce the mature cell wall.

REVIEW
1. How do the events of mitotic prophase prepare the chromatids for later separation at anaphase?
2. What are some of the activities of the kinetochore during mitosis?
3. Describe the events that occur in a cell during prometaphase and during anaphase.
4. Describe the similarities and differences in microtubule dynamics between metaphase and anaphase. How are the differences related to anaphase A and B movements?
5. What types of force-generating mechanisms might be responsible for chromosome movement during anaphase?
6. Contrast the events that occur during cytokinesis in typical plant and animal cells.

14.3 | Meiosis

The production of offspring by sexual reproduction includes the union of two cells, each with a haploid set of chromosomes. As discussed in Chapter 10, the doubling of the chromosome number at fertilization is compensated by an equivalent reduction in chromosome number at a stage prior to formation of the gametes. This is accomplished by meiosis, a term coined in 1905 from the Greek word meaning “reduction.” Meiosis ensures production of a haploid phase in the life cycle, and fertilization ensures a diploid phase. Without meiosis, the chromosome number would double with each generation, and sexual reproduction would not be possible.

Figure 14.39 The stages of meiosis. [Panels: Interphase, Prophase I, Metaphase I, Anaphase I, Telophase I, Prophase II, Metaphase II, Anaphase II, Telophase II, 4 haploid cells.]

To compare the events of mitosis and meiosis, we need to examine the fate of the chromatids. Prior to both mitosis and meiosis, diploid G2 cells contain pairs of homologous chromosomes, with each chromosome consisting of two chromatids. During mitosis, the chromatids of each chromosome are split apart and separate into two daughter nuclei in a single division. As a result, cells produced by mitosis contain pairs of homologous chromosomes and are genetically identical to their parents. During meiosis, in contrast, the four chromatids of a pair of replicated homologous chromosomes are distributed among four daughter nuclei. Meiosis accomplishes this feat by incorporating two sequential divisions without an intervening round of DNA replication (Figure 14.39). In the first meiotic division, each chromosome (consisting of two chromatids) is separated from its homologue. As a result, each daughter cell contains only one member of each pair of homologous chromosomes. For this to occur, homologous chromosomes are paired during prophase of the first meiotic division (prophase I, Figure 14.39) by an elaborate process that has no counterpart in mitosis. As they are paired, homologous chromosomes engage in a process of genetic recombination that produces chromosomes with new combinations of

Figure 14.40 A comparison of three major groups of organisms based on the stage within the life cycle at which meiosis occurs and the duration of the haploid phase. [Panels: gametic or terminal (left); sporic or intermediate; zygotic or initial (right).] (THE CELL IN DEVELOPMENT AND HEREDITY BY WILSON, EDMUND B. COPYRIGHT 1987. REPRODUCED WITH PERMISSION OF TAYLOR & FRANCIS GROUP LLC - BOOKS IN THE FORMAT TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

The Stages of Meiosis

As with mitosis, the prelude to meiosis includes DNA replication. The premeiotic S phase generally takes several times longer than a premitotic S phase. Prophase of the first meiotic

maternal and paternal alleles (see metaphase I, Figure 14.39). In the second meiotic division, the two chromatids of each chromosome are separated from one another (anaphase II, Figure 14.39). A survey of various eukaryotes reveals marked differences with respect to the stage within the life cycle at which meiosis occurs and the duration of the haploid phase. The following three groups (Figure 14.40) can be identified on these bases: 1. Gametic or terminal meiosis. In this group, which includes all multicellular animals and many protists, the meiotic divisions are closely linked to the formation of the gametes (Figure 14.40, left). In male vertebrates (Figure 14.41a), for example, meiosis occurs just prior to the differentiation of the spermatozoa. Spermatogonia that are committed to undergo meiosis become primary spermatocytes, which then undergo the two divisions of meiosis to produce four relatively undifferentiated spermatids. Each spermatid undergoes a complex differentiation to become the highly specialized sperm cell (spermatozoon). In female vertebrates (Figure 14.41b), oogonia become primary oocytes, which then enter a greatly extended meiotic prophase. During this prophase, the primary oocyte grows and becomes filled with yolk and other materials. It is only after differentiation of the oocyte is complete (i.e., the oocyte has reached essentially the same state as when it is fertilized) that the meiotic divisions occur. Vertebrate eggs are typically fertilized at a stage before the completion of meiosis (usually at metaphase II). Meiosis is completed after fertilization, while the sperm resides in the egg cytoplasm. 2. Zygotic or initial meiosis. In this group, which includes only protists and fungi, the meiotic divisions occur just after fertilization (Figure 14.40, right) to produce haploid spores. The spores divide by mitosis to produce a haploid adult generation. 
Consequently, the diploid stage of the life cycle is restricted to a brief period after fertilization when the individual is still a zygote. 3. Sporic or intermediate meiosis. In this group, which includes plants and some algae, the meiotic divisions take place at a stage unrelated to either gamete formation or fertilization (Figure 14.40, center). If we begin the life cycle with the union of a male gamete (the pollen grain) and a female gamete (the egg), the diploid zygote undergoes mitosis and develops into a diploid sporophyte. At some stage in the development of the sporophyte, sporogenesis (which includes meiosis) occurs, producing spores that germinate directly into a haploid gametophyte. The gametophyte can be either an independent stage or, as in the case of seed plants, a tiny structure retained within the ovules. In either case, the gametes are produced from the haploid gametophyte by mitosis.

Figure 14.41 The stages of gametogenesis in vertebrates: a comparison between the formation of sperm and eggs. In both sexes, a relatively small population of primordial germ cells present in the embryo proliferates by mitosis to form a population of gonial cells (spermatogonia or oogonia) from which the gametes differentiate. In the male (a), meiosis occurs before differentiation, whereas in the female (b), both meiotic divisions occur after differentiation. Each primary spermatocyte generally gives rise to four viable gametes, whereas each primary oocyte forms only one fertilizable egg and two or three polar bodies.

division (i.e., prophase I) is typically lengthened in extraordinary fashion when compared to mitotic prophase. In the human female, for example, oocytes initiate prophase I of meiosis prior to birth and then enter a period of prolonged arrest. Oocytes resume meiosis just prior to the time they are ovulated, which occurs every 28 days or so after an individual reaches puberty. Consequently, many human oocytes remain arrested in the same approximate stage of prophase for several decades. The first meiotic prophase is also very complex and is customarily divided into several stages that are similar in all sexually reproducing eukaryotes (Figure 14.42). The first stage of prophase I is leptotene, during which the chromosomes become compacted and visible in the light microscope. Although the chromosomes have replicated at an earlier stage, there is no indication that each chromosome is actually composed of a pair of identical chromatids. In the electron microscope, however, the chromosomes are revealed to be composed of paired chromatids. The second stage of prophase I, which is called zygotene, is marked by the visible association of homologues with one another. This process of chromosome pairing is called synapsis and is an intriguing event with important unanswered questions: On what basis do the homologues recognize one

another? How does the pair become so perfectly aligned? When does recognition between homologues first occur? Recent studies have shed considerable light on these questions. It had been assumed for years that interaction between homologous chromosomes first begins as chromosomes initiate synapsis. However, studies on yeast cells by Nancy Kleckner and her colleagues at Harvard University demonstrated that homologous regions of DNA from homologous chromosomes are already associated with one another during leptotene. Chromosome compaction and synapsis during zygotene simply make this arrangement visible under the microscope. As will be discussed below, the first step in genetic recombination is the deliberate introduction of double-stranded breaks in aligned DNA molecules. Studies in both yeast and mice suggest the DNA breaks occur in leptotene, well before the chromosomes are visibly paired. These findings are supported by studies aimed at locating particular DNA sequences within the nuclei of premeiotic and meiotic cells. We saw on page 510 that individual chromosomes occupy discrete regions within nuclei rather than being randomly dispersed throughout the nuclear space. When yeast cells about to enter meiotic prophase are examined, each pair of homologous chromosomes is found to share a joint territory

Figure 14.42 The stages of prophase I (leptotene, zygotene, pachytene, diplotene, diakinesis), followed by metaphase I. The events at each stage are described in the text.

distinct from the territories shared by other pairs of homologues. This finding suggests that homologous chromosomes are paired to some extent before meiotic prophase begins. The

Figure 14.43 Association of the telomeres of meiotic chromosomes with the nuclear envelope. The chromosomes of a stage of meiotic prophase of the male grasshopper in which homologous chromosomes are physically associated as part of a bivalent. The bivalents are arranged in a well-defined “bouquet” with their terminal regions clustered near the inner surface of the nuclear envelope, at the base of the photo. (FROM B. JOHN, MEIOSIS, © 1990. CAMBRIDGE UNIVERSITY PRESS. REPRINTED WITH PERMISSION. COURTESY OF BERNARD JOHN.)

telomeres (terminal segments) of leptotene chromosomes are distributed throughout the nucleus. Then, near the end of leptotene, there is a dramatic reorganization of chromosomes in many species so that the telomeres become localized at the inner surface of the nuclear envelope at one side of the nucleus. Clustering of telomeres at one end of the nuclear envelope occurs in a wide variety of eukaryotes and causes the chromosomes to resemble the clustered stems of a bouquet of flowers (Figure 14.43). Mice carrying mutations that prevent the association of chromosomes with the nuclear envelope exhibit defects in synapsis, genetic recombination, and gamete formation. These experimental results suggest that the nuclear envelope plays an important role in the interaction between homologous chromosomes during meiosis. Electron micrographs indicate that chromosome synapsis is accompanied by the formation of a complex structure called the synaptonemal complex. The synaptonemal complex (SC) is a ladder-like structure with transverse protein filaments connecting the two lateral elements (Figure 14.44). The chromatin of each homologue is organized into loops that extend from one of the lateral elements of the SC (Figure 14.44b). The lateral elements are composed primarily of cohesin (page 584), which presumably binds together the chromatin of the sister chromatids. For many years, the SC was thought to hold each pair of homologous chromosomes in the proper position to initiate genetic recombination between strands of homologous DNA. It is now evident that the SC is not required for genetic recombination. Not only does the SC form after genetic recombination has been initiated, but mutant yeast cells unable to assemble an SC can still engage in the exchange of genetic information between homologues. It is currently thought that the SC functions primarily as a scaffold to allow interacting chromatids to complete their crossover activities, as described below.


electron-dense bodies about 100 nm in diameter are seen within the center of the SC. These structures have been named recombination nodules because they correspond to the sites where crossing-over is taking place, as evidenced by the associated synthesis of DNA that occurs during intermediate steps of recombination (page 610). Recombination nodules contain the enzymatic machinery that facilitates genetic recombination, which is completed by the end of pachytene. The beginning of diplotene, the next stage of meiotic prophase I (Figure 14.42), is recognized by the dissolution of the SC, which leaves the chromosomes attached to one another at specific points by X-shaped structures, termed chiasmata (singular chiasma) (Figure 14.45). Chiasmata are located at sites on the chromosomes where crossing-over between DNA molecules from the two chromosomes had previously occurred. Chiasmata are formed by covalent junctions between a chromatid from one homologue and a nonsister chromatid from the other homologue. These points of attach-

Figure 14.44 The synaptonemal complex. (a) Electron micrograph of a human pachytene bivalent showing a pair of homologous chromosomes held in a tightly ordered parallel array. K, kinetochore. (b) Schematic diagram of the synaptonemal complex and its associated chromosomal fibers. The dense granules (recombination nodules) seen in the center of the SC (indicated by the arrowhead in part a) contain the enzymatic machinery required to complete genetic recombination, which is thought to begin at a much earlier stage in prophase I. Closely paired loops of DNA from the two sister chromatids of each chromosome are depicted. The loops are likely maintained in a paired configuration by cohesin (not shown). Genetic recombination (crossing-over) is presumed to occur between the DNA loops from nonsister chromatids, as shown. (A: COURTESY OF ALBERTO J. SOLARI, CHROMOSOMA 81:330, 1980. WITH KIND PERMISSION FROM SPRINGER SCIENCE+BUSINESS MEDIA.)


The complex formed by a pair of synapsed homologous chromosomes is called a bivalent or a tetrad. The former term reflects the fact that the complex contains two homologues, whereas the latter term calls attention to the presence of four chromatids. The end of synapsis marks the end of zygotene and the beginning of the next stage of prophase I, called pachytene (Figure 14.44a), which is characterized by a fully formed synaptonemal complex. During pachytene, the homologues are held closely together along their length by the SC. The DNA of sister chromatids is extended into parallel loops (Figure 14.44b). Under the electron microscope, a number of

Figure 14.45 Visible evidence of crossing-over. (a,b) Diplotene bivalents from the grasshopper showing the chiasmata formed between chromatids of each homologous chromosome. The accompanying inset indicates the crossovers that have presumably occurred within the bivalent in a. The chromatids of each diplotene chromosome are closely apposed except at the chiasmata. (c) Scanning electron micrograph of a bivalent from the desert locust with three chiasmata (arrows). (A,B: FROM BERNARD JOHN, MEIOSIS, © 1990 CAMBRIDGE UNIVERSITY PRESS. REPRINTED WITH PERMISSION. C: FROM KLAUS WERNER WOLF, BIOESS. 16:108, 1994. REPRINTED WITH PERMISSION OF JOHN WILEY & SONS.)

Figure 14.46 Separation of homologous chromosomes during meiosis I and separation of chromatids during meiosis II. (a) Schematic diagram of a pair of homologous chromosomes at metaphase I. The chromatids are held together along both their arms and centromeres by cohesin. The pair of homologues are maintained as a bivalent by the chiasmata. Inset micrograph shows that the kinetochores (arrowheads) of sister chromatids are situated on one side of the chromosome, facing the same pole. The black dots are gold particles bound to the motor protein CENP-E of the kinetochores (see Figure 14.16c). (b) At anaphase I, the cohesin holding the arms of the chromatids is cleaved, allowing the homologues to separate from one another. Cohesin remains at the centromere, holding the chromatids together. (c) At metaphase II, the chromatids are held together at the centromere, with microtubules from opposite poles attached to the two kinetochores. Inset micrograph shows the kinetochores of the sister chromatids are now on opposite sides of the chromosome, facing opposite poles. (d) At anaphase II, the cohesin holding the chromatids together has been cleaved, allowing the chromosomes to move to opposite poles. (INSETS: FROM JIBAK LEE ET AL., MOL. REPROD. DEVELOP. 56:51, 2000. REPRINTED WITH PERMISSION OF JOHN WILEY & SONS.)

tween sister chromatids in regions that flank these sites of recombination (Figure 14.46a). The chiasmata disappear at the metaphase I–anaphase I transition, as the arms of the chromatids of each bivalent lose cohesion (Figure 14.46b). Loss of cohesion between the arms is accomplished by proteolytic cleavage of the cohesin molecules in those regions of the chromosome. In contrast, cohesion between the joined centromeres of sister chromatids remains strong, because the cohesin situated there is protected from proteolytic attack (Figure 14.46b). As a result, sister chromatids remain firmly attached to one another as they move together toward a spindle pole during anaphase I.
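The random orientation of bivalents at metaphase I, which underlies independent assortment at anaphase I, lends itself to a quick count. The sketch below is only an illustration: it treats each homologous pair as an independent coin toss between the maternal (M) and paternal (P) homologue and ignores crossing-over, which adds further variety. The function names are hypothetical, and the value 23 is the human haploid number (22 autosomes plus one sex chromosome).

```python
import random


def assortment_combinations(haploid_number: int) -> int:
    """Count the distinct maternal/paternal chromosome sets that
    independent assortment alone can produce (crossing-over ignored)."""
    return 2 ** haploid_number


def random_gamete(haploid_number: int, rng: random.Random) -> str:
    """Toy model: for each homologous pair, pick the maternal (M) or
    paternal (P) homologue with equal probability."""
    return "".join(rng.choice("MP") for _ in range(haploid_number))


if __name__ == "__main__":
    # Humans: 23 pairs of homologous chromosomes.
    print(assortment_combinations(23))  # 8388608
    # One randomly assorted gamete, written as 23 M/P choices.
    print(random_gamete(23, random.Random()))
```

Even before recombination is considered, a single human could in principle form more than eight million chromosomally distinct gametes under this simplified model.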


ment provide a striking visual portrayal of the extent of genetic recombination. The chiasmata are made more visible by a tendency for the homologues to separate from one another at the diplotene stage. In vertebrates, diplotene can be an extremely extended phase of oogenesis during which the bulk of oocyte growth occurs. Thus diplotene can be a period of intense metabolic activity. Transcription during diplotene in the oocyte provides the RNA utilized for protein synthesis during both oogenesis and early embryonic development following fertilization. During the final stage of meiotic prophase I, called diakinesis, the meiotic spindle is assembled and the chromosomes are prepared for separation. In those species in which the chromosomes become highly dispersed during diplotene, the chromosomes become recompacted during diakinesis. Diakinesis ends with the disappearance of the nucleolus, the breakdown of the nuclear envelope, and the movement of the tetrads to the metaphase plate. In vertebrate oocytes, these events are triggered by an increase in the level of the protein kinase activity of MPF (maturation-promoting factor). As discussed in the Experimental Pathways at the end of the chapter, MPF was first identified by its ability to initiate these events, which represent the maturation of the oocyte (page 611). In most eukaryotic species, chiasmata can still be seen in homologous chromosomes aligned at the metaphase plate of meiosis I. In fact, chiasmata are required to hold the homologues together as a bivalent during this stage. In humans and other vertebrates, every pair of homologues typically contains at least one chiasma, and the longer chromosomes tend to have two or three of them. It is thought that some mechanism exists to ensure that even the smallest chromosomes form a chiasma. If a chiasma does not occur between a pair of homologous chromosomes, the chromosomes of that bivalent tend to separate from one another after dissolution of the SC. 
This premature separation of homologues often results in the formation of nuclei with an abnormal number of chromosomes. The consequences of such an event are discussed in the accompanying Human Perspective. At metaphase I, the two homologous chromosomes of each bivalent are connected to the spindle fibers from opposite poles (Figure 14.46a). In contrast, sister chromatids are connected to microtubules from the same spindle pole, which is made possible by the side-by-side arrangement of their kinetochores as seen in the inset of Figure 14.46a. The orientation of the maternal and paternal chromosomes of each bivalent on the metaphase I plate is random; the maternal member of a particular bivalent has an equal likelihood of facing either pole. Consequently, when homologous chromosomes separate during anaphase I, each pole receives a random assortment of maternal and paternal chromosomes (see Figure 14.39). Thus, anaphase I is the cytological event that corresponds to Mendel’s law of independent assortment (page 388). As a result of independent assortment, organisms are capable of generating a nearly unlimited variety of gametes. Separation of homologous chromosomes at anaphase I requires the dissolution of the chiasmata that hold the bivalents together. The chiasmata are maintained by cohesion be-


Telophase I of meiosis I produces less dramatic changes than telophase of mitosis. Although chromosomes often undergo some dispersion, they do not reach the extremely extended state of the interphase nucleus. The nuclear envelope may or may not reform during telophase I. The stage between the two meiotic divisions is called interkinesis and is generally short-lived. In animals, cells in this fleeting stage are referred to as secondary spermatocytes or secondary oocytes. These cells are characterized as being haploid because they contain only one member of each pair of homologous chromosomes. Even though they are haploid, they have twice as much DNA as a haploid gamete because each chromosome is still represented by a pair of attached chromatids. Secondary spermatocytes are said to have a 2C amount of DNA, half as much as a primary spermatocyte, which has a 4C DNA content, and twice as much as a sperm cell, which has a 1C DNA content. Interkinesis is followed by prophase II, a much simpler prophase than its predecessor. If the nuclear envelope had reformed in telophase I, it is broken down again. The chromosomes become recompacted and line up at the metaphase

plate. Unlike metaphase I, the kinetochores of sister chromatids of metaphase II face opposite poles and become attached to opposing sets of chromosomal spindle fibers (Figure 14.46c). The progression of meiosis in vertebrate oocytes stops at metaphase II. The arrest of meiosis at metaphase II is brought about by factors that inhibit APC^Cdc20 activation, thereby preventing cyclin B degradation. As long as cyclin B levels remain high within the oocyte, Cdk activity is maintained, and the cells cannot progress to the next meiotic stage. Metaphase II arrest is released only when the oocyte (now called an egg) is fertilized. Fertilization leads to a rapid influx of Ca2+ ions, the activation of APC^Cdc20 (page 592), and the destruction of cyclin B. The fertilized egg responds to these changes by completing the second meiotic division. Anaphase II begins with the synchronous splitting of the centromeres, which had held the sister chromatids together, allowing them to move toward opposite poles of the cell (Figure 14.46d). Meiosis II ends with telophase II, in which the chromosomes are once again enclosed by a nuclear envelope. The products of meiosis are haploid cells with a 1C amount of nuclear DNA.
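The bookkeeping of ploidy (n) and DNA content (C) through the two meiotic divisions can be laid out as a small worked example. This is a minimal sketch, assuming the C value is simply the number of chromosome sets multiplied by the number of chromatids per chromosome; the `Stage` class and its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    chromosome_sets: int            # 2 = diploid (2n), 1 = haploid (n)
    chromatids_per_chromosome: int  # 2 after replication, 1 after meiosis II

    @property
    def c_value(self) -> int:
        """DNA content in multiples of C (the DNA amount of one gamete)."""
        return self.chromosome_sets * self.chromatids_per_chromosome


STAGES = [
    Stage("primary spermatocyte (after premeiotic S phase)", 2, 2),
    Stage("secondary spermatocyte (after meiosis I)", 1, 2),
    Stage("spermatid or sperm cell (after meiosis II)", 1, 1),
]

for stage in STAGES:
    # Prints the ploidy and DNA content of each stage: 2n/4C, n/2C, n/1C.
    print(f"{stage.name}: {stage.chromosome_sets}n, {stage.c_value}C")
```

The table makes the distinction explicit: meiosis I halves the chromosome number (2n to n), whereas meiosis II halves the DNA content of an already haploid cell (2C to 1C).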

THE HUMAN PERSPECTIVE

Meiotic Nondisjunction and Its Consequences Meiosis is a complex process, and meiotic mistakes in humans appear to be surprisingly common. Homologous chromosomes may fail to separate from each other during meiosis I, or sister chromatids may fail to come apart during meiosis II. When either of these situations occurs, gametes are formed that contain an abnormal number of chromosomes, either an extra chromosome or a missing chromosome (Figure 1). If one of these gametes happens to fuse with a normal gamete, a zygote with an abnormal number of chromosomes forms, and serious consequences arise. In most cases, the zygote develops into an abnormal embryo that dies at some stage between conception and birth. In a few cases, however, the zygote develops into an infant whose cells have an abnormal chromosome number, a condition known as aneuploidy. The consequences of aneuploidy depend on which chromosome or chromosomes are affected. The normal human chromosome complement is 46: 22 pairs of autosomes and one pair of sex chromosomes. An extra chromosome (producing a total of 47 chromosomes) creates a condition referred to as a trisomy (Figure 2). A person whose cells contain an extra chromosome 21, for example, has trisomy 21. A missing chromosome (producing a total of 45 chromosomes) produces a monosomy. We will begin by considering the effects of an abnormal number of autosomes.

Figure 1 Meiotic nondisjunction occurs when chromosomes fail to separate from each other during meiosis. If the failure to separate occurs during the first meiotic division, which is called primary nondisjunction, all of the haploid cells have an abnormal number of chromosomes. If nondisjunction occurs during the second meiotic division, which is called secondary nondisjunction, only two of the four haploid cells are affected. (More complex types of nondisjunction than shown here can also occur.)
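The outcomes summarized in Figure 1 can be tallied with a short sketch that follows a single chromosome pair through both divisions. It is a deliberately simplified model, assuming at most one nondisjunction event and assuming that secondary nondisjunction affects only one of the two cells entering meiosis II; the function name and the "primary"/"secondary" argument values are hypothetical labels chosen for this example.

```python
from typing import List, Optional


def meiotic_products(nondisjunction: Optional[str] = None) -> List[int]:
    """Return the number of copies of one chromosome in each of the
    four haploid products of a single meiosis.

    nondisjunction: None (normal), "primary" (failure at meiosis I),
    or "secondary" (failure at meiosis II in one of the two cells).
    """
    # Meiosis I: homologues separate into two cells. In primary
    # nondisjunction both homologues travel to the same cell.
    after_meiosis_i = [2, 0] if nondisjunction == "primary" else [1, 1]

    products: List[int] = []
    for index, chromosomes in enumerate(after_meiosis_i):
        if nondisjunction == "secondary" and index == 0:
            # Sister chromatids fail to come apart in this cell only.
            products += [2, 0]
        else:
            # Meiosis II: each chromosome contributes one chromatid
            # to each of the two daughter cells.
            products += [chromosomes, chromosomes]
    return products


if __name__ == "__main__":
    print(meiotic_products())             # [1, 1, 1, 1]
    print(meiotic_products("primary"))    # [2, 2, 0, 0]
    print(meiotic_products("secondary"))  # [2, 0, 1, 1]
```

A count of 2 corresponds to a gamete with an extra chromosome and a count of 0 to a gamete missing a chromosome, so the model reproduces the statement that primary nondisjunction affects all four products and secondary nondisjunction only two.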

The absence of one autosomal chromosome, regardless of which chromosome is affected, invariably proves to be lethal at some stage during embryonic or fetal development. Consequently, a zygote containing an autosomal monosomy does not give rise to a fetus that is carried to term. Although one might not expect that possession of an extra chromosome would create a life-threatening condition, the fact is that trisomic zygotes do not fare much better than monosomic ones. Of the 22 different autosomes in the human chromosome complement, only persons with trisomy 21 can survive beyond the first few weeks or months of life. Most of the other possible trisomies are lethal during development, whereas infants with trisomy 13 or trisomy 18 are often born alive but have such severe abnormalities that they succumb soon after birth. More than one-quarter of fetuses that spontaneously abort carry a chromosomal trisomy. It is thought that many more zygotes carrying abnormal chromosome numbers produce embryos that die at an early stage of development before the pregnancy is recognized. For example, for every trisomic zygote formed at fertilization, there is presumably an equal number of monosomic zygotes that fare even less well. It is estimated that approximately 20 to 25 percent of human oocytes are aneuploid, which is much higher than in any other species that has been studied. Meiosis in males occurs with a much lower level of chromosomal abnormalities than in females.

Not all of the aneuploidy that occurs during human development necessarily begins with the zygote. One recent study of early embryos formed by in vitro fertilization (IVF) found that most of these embryos contained some cells (called blastomeres) with an aneuploid karyotype. Aneuploid cells were present along with normal cells in the same embryo, suggesting (1) that chromosome nondisjunction occurs frequently during mitotic divisions of early embryonic cells and (2) that these abnormal cells are somehow destroyed during early development, owing to the fact that babies born as a result of in vitro fertilization do not exhibit an abnormally high level of chromosome abnormalities. Because it is impossible to study the chromosome complement of cells from human embryos conceived in vivo, we don’t know whether the occurrence of mitotic nondisjunction seen in this study is a phenomenon restricted to embryos generated by IVF.

Even though chromosome 21 is the smallest human chromosome with fewer than 400 genes, the presence of an extra copy of this genetic material leads to Down syndrome. Persons with Down syndrome exhibit varying degrees of mental impairment, alteration in certain body features, circulatory problems, increased susceptibility to infectious diseases, a greatly increased risk of developing leukemia, and the early onset of Alzheimer’s disease. All of these medical problems are thought to result from a higher-than-normal level of expression of genes located on chromosome 21. Moreover, the expression of genes on other chromosomes can also be affected by the excess levels of certain transcription factors encoded by genes on chromosome 21.

The presence of an abnormal number of sex chromosomes is much less disruptive to human development. A zygote with only one X chromosome and no second sex chromosome (denoted as XO) develops into a female with Turner syndrome, in which genital development is arrested in the juvenile state, the ovaries fail to develop, and body structure is slightly abnormal. Because a Y chromosome is male determining, persons with at least one Y chromosome develop as males. A male with an extra X chromosome (XXY) develops Klinefelter syndrome, which is characterized by mental retardation, underdevelopment of genitalia, and the presence of feminine physical characteristics (such as breast enlargement). A zygote with an extra Y (XYY) develops into a physically normal male who is likely to be taller than average. Considerable controversy developed surrounding claims that XYY males tend to exhibit more aggressive, antisocial, and criminal behavior than do XY males, but this hypothesis has never been substantiated.

The likelihood of having a child with Down syndrome rises dramatically with the age of the mother, from 0.05 percent for mothers 19 years of age to greater than 3 percent for women over the age of 45. Most studies show that no such correlation is found between the age of the father and the likelihood of bearing a child with trisomy 21.a Estimates based on comparisons of DNA sequences between the offspring and the parents indicate that approximately 95 percent of trisomies 21 can be traced to nondisjunction having occurred in the mother.

It was noted above that an abnormal chromosome number can result from nondisjunction at either of the two meiotic divisions (Figure 1). Although these different nondisjunction events produce the same effect in terms of chromosome numbers in the zygote, they can be distinguished by genetic analysis. Primary nondisjunction transmits two homologous chromosomes to the zygote, whereas secondary nondisjunction transmits two sister chromatids (most likely altered by crossing over) to the zygote. Studies indicate that most of the mistakes occur during meiosis I. For example, in one study of 433 cases of trisomy 21 that resulted from maternal nondisjunction, 373 were the result of errors that had occurred during meiosis I and 60 were the result of errors during meiosis II.

Why should meiosis I be more susceptible to nondisjunction than meiosis II? We don’t know the precise answer to that question, but it almost certainly reflects the fact that oocytes of older women have remained arrested in meiosis I for a very long period within the ovary. It was noted in the chapter that chiasmata, which are visual indicators of genetic recombination, play an important role in holding a bivalent together during metaphase I. According to one hypothesis, meiotic spindles of older oocytes are less able to hold together weakly constructed bivalents (e.g., bivalents with only one chiasma located near the tip of the chromosome) than those of younger oocytes, increasing the likelihood that homologous chromosomes will missegregate at anaphase I. Another possibility is that sister chromatid cohesion, which prevents the chiasmata from “sliding off” the end of the chromosome, is not fully maintained over an extended period, allowing homologues to separate prematurely.

a Although chromosome nondisjunction does not appear to increase during spermatogenesis in relation to the age of the father, the number of new mutations that are passed on to a child by the father does apparently increase as a man ages. This finding is linked to an apparent increase in the risk of autism in children born to older fathers (see Nature 488:471, 2012).

Figure 2 The karyotype of a person with Down syndrome. The karyotype shows an extra chromosome 21 (trisomy 21). (PHANIE/ PHOTO RESEARCHERS, INC.)
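The figures quoted in this Human Perspective can be turned into proportions with a few lines of arithmetic. The sketch below uses only numbers stated in the text (373 meiosis I errors and 60 meiosis II errors among 433 maternal cases; a risk of 0.05 percent at age 19 and more than 3 percent over age 45). The variable and function names are hypothetical, and because the text gives "greater than 3 percent," the fold increase computed here is a lower bound.

```python
def share(part: int, whole: int) -> float:
    """Fraction of `whole` represented by `part`."""
    return part / whole


# Maternal nondisjunction cases of trisomy 21 from the study cited in the text.
MEIOSIS_I_ERRORS = 373
MEIOSIS_II_ERRORS = 60
TOTAL_CASES = MEIOSIS_I_ERRORS + MEIOSIS_II_ERRORS

# Risk of Down syndrome by maternal age, in percent.
RISK_AT_19 = 0.05
RISK_OVER_45 = 3.0  # the text says "greater than 3 percent"

fold_increase = RISK_OVER_45 / RISK_AT_19

print(TOTAL_CASES)                                      # 433
print(f"{share(MEIOSIS_I_ERRORS, TOTAL_CASES):.1%}")    # 86.1%
print(f"{share(MEIOSIS_II_ERRORS, TOTAL_CASES):.1%}")   # 13.9%
print(f"at least {fold_increase:.0f}-fold")             # at least 60-fold
```

Roughly six of every seven maternal errors in that study arose at meiosis I, and the risk rises by a factor of at least 60 between the two maternal ages quoted.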


Genetic Recombination During Meiosis In addition to reducing the chromosome number as required for sexual reproduction, meiosis increases the genetic variability in a population of organisms from one generation to the next. Independent assortment allows maternal and paternal chromosomes to become shuffled during formation of the gametes, and genetic recombination (crossing-over) allows maternal and paternal alleles on a given chromosome to become shuffled as well (see Figure 14.39). Without genetic recombination, the alleles along a particular chromosome would remain tied together, generation after generation. By mixing maternal and paternal alleles between homologous chromosomes, meiosis generates organisms with novel genotypes and phenotypes on which natural selection can act (see Figure 10.7 for an example from the fruit fly). Recombination involves the physical breakage of individual DNA molecules and the ligation of the split ends from one DNA duplex with the split ends of the duplex from the homologous chromosome. Recombination is a remarkably precise process that normally occurs without the addition or loss of a single base pair. To occur so faithfully, recombination depends

Figure 14.47 A proposed mechanism for genetic recombination initiated by double-strand breaks. The steps (1 through 7), including formation of the Holliday junctions and their resolution into a noncrossover or a crossover, are described in the text.

on the complementary base sequences that exist between a single strand from one chromosome and the homologous strand of another chromosome, as discussed below. The precision of recombination is further ensured by the involvement of DNA repair enzymes that fill gaps that develop during the exchange process. A simplified model of the proposed steps that occur during recombination in eukaryotic cells is shown in Figure 14.47. In this model, two DNA duplexes that are about to recombine become aligned next to one another as the result of some type of homology search in which homologous DNA molecules associate with one another in preparation for recombination. Once they are aligned, an enzyme (Spo11) introduces a double-stranded break into one of the duplexes (step 1, Figure 14.47). The gap is subsequently widened (resected) as indicated in step 2. Resection may occur by the action of a 5′ → 3′ exonuclease or by an alternate mechanism. Regardless, the broken strands possess exposed single-stranded tails, each bearing a 3′ OH terminus. In the model shown in Figure 14.47, one of the single-stranded tails leaves its own duplex and invades the DNA molecule of a nonsister chromatid, hydrogen bonding with the complementary strand in the neighboring duplex (step 3). In E. coli, this process in which a single strand invades an intact homologous duplex and displaces the corresponding strand in that duplex is catalyzed by a recombinase enzyme, called the RecA protein. The RecA recombinase polymerizes along a length of single-stranded DNA forming a nucleoprotein filament. RecA enables the single-stranded DNA to search for and invade a homologous double helix. Eukaryotic cells have homologues of RecA (e.g., Rad51) that are thought to catalyze strand invasion. Strand invasion activates a DNA repair activity (Section 13.3) that fills the gaps as shown in step 4.

As a result of the reciprocal exchange of DNA strands, the two duplexes are covalently linked to one another to form a joint molecule (or heteroduplex) that contains a pair of DNA crossovers, or Holliday junctions, that flank the region of strand exchange (steps 4 and 5, Figure 14.47). These junctions are named after Robin Holliday, who proposed their existence in 1964. This type of recombination intermediate need not be a static structure because the point of linkage may move in one direction or another (an event known as branch migration) by breaking the hydrogen bonds holding the original pairs of strands and reforming hydrogen bonds between strands of the newly joined duplexes (step 5). Formation of Holliday junctions and branch migration occur during pachytene (Figure 14.42). To resolve the interconnected DNA molecules of the Holliday junctions and restore the DNA back to two separate duplexes, another round of DNA cleavage must occur. Depending on the particular DNA strands that are cleaved and ligated, two alternate products can be generated. In one case, the two duplexes contain only short stretches of genetic exchange, which represents a noncrossover (step 6, Figure 14.47). In the alternate pathway of breakage and ligation, the duplex of one DNA molecule is covalently joined to the duplex of the homologous molecule, creating a site of genetic recombination (i.e., a crossover) (step 7). The decision as to whether a recombinational interaction will result in a crossover or a noncrossover is thought to occur long before the


stage when the double Holliday junction is actually resolved. Crossovers, which represent the fusion of a maternal and paternal chromosome (step 7), develop into the chiasmata required to hold the homologues together during meiosis I (page 606). In mammals, noncrossovers greatly outnumber crossovers. Moreover, as noted on page 419, crossovers occur in certain regions of the genome (recombination “hotspots”) with a much higher frequency than in other regions. Recombination hotspots average about 200 base pairs in length, but it is not entirely clear what distinguishes these sites from the remainder of the genome. Evidence suggests that differences in both DNA sequence and chromatin structure (e.g., nucleosome density, histone modifications, and DNA-bound proteins) are likely to play a role in determining sites where double-strand breaks are introduced.
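The base-pairing logic behind strand invasion (step 3) can be illustrated with a small sketch. This is only a toy under stated assumptions: the sequences, the function names, and the exact-match search are all invented for illustration, and the real RecA/Rad51-mediated homology search operates on a nucleoprotein filament and is not modeled here.

```python
# Toy illustration of the base pairing that underlies strand invasion.
# All sequences and function names are hypothetical; this is not a model
# of how RecA or Rad51 actually carries out the homology search.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs antiparallel with `seq` (both written 5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_invasion_site(tail: str, template_strand: str) -> int:
    """Return the 0-based index in `template_strand` (written 5'->3') where the
    single-stranded `tail` can base-pair along its full length, or -1 if no
    perfectly complementary stretch exists."""
    return template_strand.find(reverse_complement(tail))


# Hypothetical single-stranded tail exposed by resection (written 5'->3').
tail = "GATTACA"
# Hypothetical complementary strand of the homologous duplex (written 5'->3').
template = "CCGTTGTAATCGGA"

site = find_invasion_site(tail, template)
print("tail pairs with the template starting at index", site)
```

The sketch requires a perfect match over the whole tail, which is a simplification chosen to keep the example short; it is meant only to show why complementary sequence is what allows a strand from one duplex to pair with a strand of the other.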

REVIEW 1. Contrast the overall roles of mitosis and meiosis in the lives of a plant or animal. How do the nuclei formed by these two processes differ from one another? 2. Contrast the events that take place during prophase I and prophase II of meiosis. 3. Contrast the timing of meiosis in spermatogenesis versus oogenesis. 4. What is the role of DNA strand breaks in genetic recombination?

EXPERIMENTAL PATHWAYS

The Discovery and Characterization of MPF

As an amphibian oocyte nears the end of oogenesis, the large nucleus (called a germinal vesicle) moves toward the periphery of the cell. In subsequent steps, the nuclear envelope disassembles, the compacted chromosomes become aligned along a metaphase plate near one end (the animal pole) of the oocyte, and the cell undergoes the first meiotic division to produce a large secondary oocyte and a small polar body. The processes of germinal vesicle breakdown and first meiotic division are referred to as maturation and can be induced in fully grown oocytes by treatment with the steroid hormone progesterone. The first sign of maturation in the hormone-treated amphibian oocyte is seen 13–18 hours following progesterone treatment as the germinal vesicle moves near the oocyte surface. Germinal vesicle breakdown soon follows, and the oocyte reaches metaphase of the second meiotic division by about 36 hours after hormone treatment. Progesterone induces maturation only if it is applied to the external medium surrounding the oocyte; if the hormone is injected into the oocyte, the oocyte shows no response.1 It appears that the hormone acts at the cell surface to trigger secondary changes in the cytoplasm of the oocyte that lead to germinal vesicle breakdown and the other changes associated with maturation.

To learn more about the nature of the cytoplasmic change that was responsible for triggering maturation, Yoshio Masui of the University of Toronto and Clement Markert of Yale University began a series of experiments in which they removed cytoplasm from isolated frog oocytes at various stages following progesterone treatment and injected 40–60 nanoliters (nl) of the donor cytoplasm into fully grown, immature oocytes that had not been treated with the hormone.2 They found that cytoplasm taken from oocytes during the first 12 hours following progesterone treatment had little or no effect on recipient oocytes. After this period, however, the cytoplasm gained the ability to induce maturation in the recipient oocyte. The cytoplasm from the donor oocyte was maximally effective about 20 hours after progesterone treatment, and its effectiveness declined by 40 hours (Figure 1). However, cytoplasm taken from early embryos continued to show some ability to induce oocyte maturation. Masui and Markert referred to the cytoplasmic substance(s) that induce maturation in recipient oocytes as “maturation promoting factor,” which became known as MPF. Because it was assumed that MPF was involved specifically in triggering oocyte maturation, relatively little interest was paid at first to the substance or its possible mechanism of action. Then in 1978,

Figure 1 Change of activity of the maturation-promoting factor in the oocyte cytoplasm of Rana pipiens during the course of maturation and early development. Ordinate: the ratio of frequency of induced maturation to volume of injected cytoplasm. The higher the ratio, the more effective the cytoplasm. nl, nanoliters of injected cytoplasm. Abscissa: age of the donors (hours after administration of progesterone). (Y. MASUI AND C. L. MARKERT, JOURNAL EXP ZOOLOGY 177:142, 1971. REPRINTED WITH PERMISSION FROM JOHN WILEY & SONS PUBLISHERS, INC.)

William Wasserman and Dennis Smith of Purdue University published a report on the behavior of MPF during early amphibian development.3 It had been assumed that MPF activity present in early embryos was simply a residue of activity that had been present in the oocyte. But Wasserman and Smith discovered that MPF activity undergoes dramatic fluctuations in cleaving eggs that correlate with changes in the cell cycle. It was found, for example, that cytoplasm taken from cleaving frog eggs within 30–60 minutes after fertilization contains little or no detectable MPF activity, as assayed by injection into immature oocytes (Figure 2). However, if cytoplasm is taken from an egg at 90 minutes after fertilization, MPF activity can again be demonstrated. MPF activity reaches a peak at 120 minutes after fertilization and starts to decline again at 150 minutes (Figure 2). At the time the eggs undergo their first cytokinesis at 180 minutes, no activity is detected in the eggs. Then, as the second cleavage cycle gets underway, MPF activity once again reappears, reaching a peak at 225 minutes postfertilization, and then declines again to a very low level. Similar results were found in Xenopus eggs, except that the fluctuations in MPF activity occur more rapidly than in Rana and correlate with the more rapid rate of cleavage divisions in the early Xenopus embryo. Thus, MPF activity disappears and reappears in both amphibian species on a time scale that correlates with the length of the cell cycle. In both species, the peak of MPF activity corresponds to the time of nuclear membrane breakdown and the entry of the cells into mitosis. These findings suggested that MPF does more than simply control the time of oocyte maturation and, in fact, may play a key role in regulating the cell cycle of dividing cells. It became apparent about this same time that MPF activity is not limited to amphibian eggs and oocytes but is present in a wide variety of organisms.
It was found, for example, that mammalian cells growing in culture also possess MPF activity as assayed by the ability

of mammalian cell extracts to induce germinal vesicle breakdown when injected into amphibian oocytes.4 MPF activity of mammalian cells fluctuates with the cell cycle, as it does in dividing amphibian eggs. Extracts from cultured HeLa cells prepared from early G1-, late G1-, or S-phase cells lack MPF activity (Figure 3). MPF appears in early G2, rises dramatically in late G2, and reaches a peak in mitosis. Another element of the machinery that regulates the cell cycle was discovered in studies on sea urchin embryos. Sea urchin eggs are favorite subjects for studies of cell division because the mitotic divisions following fertilization occur rapidly and are separated by highly predictable time intervals. If sea urchin eggs are fertilized in seawater containing an inhibitor of protein synthesis, the eggs fail to undergo the first mitotic division, arresting at a stage prior to chromosome compaction and breakdown of the nuclear envelope. Similarly, each of the subsequent mitotic divisions can also be blocked if an inhibitor of protein synthesis is added to the medium at a time well before the division would normally occur. This finding had suggested that one or more proteins must be synthesized during each of the early cell cycles if the ensuing mitotic division is to occur. But early studies on cleaving sea urchin eggs failed to reveal the appearance of new species of proteins during this period. In 1983, Tim Hunt and his colleagues at the Marine Biological Laboratory at Woods Hole reported on several proteins that are synthesized in fertilized sea urchin eggs but not unfertilized eggs.5 To study these proteins further, they incubated fertilized eggs in seawater containing [35S]methionine and withdrew samples at 10-minute intervals beginning at 16 minutes after fertilization. Crude protein extracts were prepared from the samples and subjected to polyacrylamide gel electrophoresis, and the labeled proteins were located autoradiographically. 
Several prominent bands were labeled in gels from fertilized egg extracts that were not evident in comparable extracts made

Figure 2 Cycling of MPF activity in fertilized R. pipiens eggs. Ordinate: percent recipient oocytes undergoing germinal vesicle breakdown in response to 80 nanoliters of cytoplasm from fertilized eggs. Abscissa: time after fertilization when the Rana egg cytoplasm was assayed for MPF activity. Arrows indicate time of cleavage divisions. (W. J. WASSERMAN AND L. D. SMITH, JOURNAL CELL BIOLOGY 78:R17, 1978. THE JOURNAL OF CELL BIOLOGY BY ROCKEFELLER INSTITUTE; AMERICAN SOCIETY FOR CELL BIOLOGY COPYRIGHT 1978 REPRODUCED WITH PERMISSION OF ROCKEFELLER UNIVERSITY PRESS IN THE FORMAT REPUBLISH IN A TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Figure 3 Maturation-promoting activity of HeLa cell extracts during different stages of the cell cycle. Ordinate: percent GVBD per 228 ng protein. Abscissa: phase of cell cycle. Because 228 ng of mitotic protein induced germinal vesicle breakdown (GVBD) in 100 percent of the cases, the percent activity for other phases of the cell cycle was normalized to that amount of protein. E, early; M, mid; and L, late. (P. S. SUNKARA, D. A. WRIGHT AND P. N. RAO, PROC. NATL ACAD. SCIENCE USA 76:2801, 1979.)

Figure 5 Kinetics of Xenopus oocyte activation by progesterone and cyclin A mRNA. Ordinate: GVBD (%). Abscissa: hours. Large, immature oocytes were isolated from fragments of ovaries and were incubated with progesterone or microinjected with varying amounts of cyclin A mRNA. At 3 to 4 hours after injection, obviously damaged oocytes were removed (about 2–4 per starting group of 20), and the remaining ones (which represent the 100% value) were allowed to develop. Germinal vesicle breakdown (GVBD) and oocyte activation were indicated by the formation of a white spot in the region of the animal pole and were confirmed by dissection of the oocytes. (FROM K. I. SWENSON, K. M. FARRELL, AND J. V. RUDERMAN, CELL 47:865, 1986. CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

from unfertilized eggs. One of the bands that appeared strongly labeled at early stages after fertilization virtually disappeared from the gel by 85 minutes after fertilization, suggesting that the protein had been selectively degraded. This same band then reappeared in gels from eggs sampled at later times and disappeared once again in a sample taken at 127 minutes after fertilization. The fluctuations in the amount of this protein are plotted in Figure 4 (protein band A) together with the cleavage index, which indicates the time course of the first two cell divisions. The degradation of the protein occurs at about the same time that the cells undergo the first and second division. A similar protein was found in the eggs of the surf clam, another invertebrate whose eggs are widely studied. Hunt and colleagues named the protein “cyclin” and noted the striking parallel in behavior between the fluctuations in cyclin levels in their investigation and MPF activity in the earlier studies. Subsequent studies showed that there were two distinct cyclins, A and B, which are degraded at different times during the cell cycle. Cyclin A is degraded during a 5–6 minute period beginning just before the metaphase–anaphase transition, and cyclin B is degraded a few minutes after this transition. These studies provided the first indication of the importance of controlled proteolysis (page 578) in the regulation of a major cellular activity. The first clear link between cyclin and MPF was demonstrated by Joan Ruderman and her colleagues at the Woods Hole Marine Biological Laboratory.6 In these studies, an mRNA encoding cyclin A was transcribed in vitro from a cloned DNA fragment that contained the entire cyclin A coding sequence. The identity of this mRNA was verified by translating it in vitro and finding that it encoded authentic clam cyclin A. When the synthetic cyclin mRNA was injected into Xenopus oocytes, the cells underwent germinal vesicle

breakdown and chromosome compaction over a time course not unlike that induced by progesterone treatment (Figure 5). These results suggested that the rise in cyclin A, which occurs normally during meiosis and mitosis, has a direct role in promoting entry into M phase. The amount of cyclin A normally drops rapidly and must be resynthesized prior to the next division or the cells cannot reenter M phase. But what is the relationship between cyclins and MPF? One of the difficulties in answering this question was the use of different organisms. MPF had been studied primarily in amphibians, and cyclins in sea urchins and clams. Evidence indicated that frog oocytes contain a pool of inactive pre-MPF molecules, which are converted to active MPFs during meiosis I. Cyclin, on the other hand, is totally absent from clam oocytes but appears soon after fertilization. Ruderman considered the possibility that cyclin A is an activator of MPF. We will return to this shortly. Meanwhile, another line of research was initiated to purify and characterize the substance responsible for MPF activity. In 1980, Michael Wu and John Gerhart of the University of California, Berkeley, accomplished a 20- to 30-fold purification of MPF by precipitating the protein in ammonium sulfate and subjecting the redissolved material to column chromatography. In addition to stimulating oocyte maturation, injections of the partially purified MPF stimulated the incorporation of 32P into proteins of the amphibian oocyte.7 When partially purified MPF preparations were incubated with [32P]ATP in vitro, proteins present within the sample became phosphorylated, suggesting that MPF induced maturation by acting as a protein kinase. MPF was finally purified in 1988 by a series of six successive chromatographic steps.8 MPF activity in these purified preparations


Figure 4 Correlation of the level of cyclin with the cell division cycle. A suspension of eggs was fertilized, and after 6 minutes, [35S]methionine was added. Samples were taken for analysis by gel electrophoresis at 10-minute intervals, starting at 16 minutes after fertilization. The autoradiograph of the electrophoretic gel was scanned for label density, and the data were plotted as shown. Protein A, which varies according to the cell cycle and is named cyclin, is shown as solid circles. Protein B (not to be confused with cyclin B), which shows no cell cycle fluctuation, is plotted as solid triangles. The percentage of cells undergoing division at any given time period is given as the cleavage index (open squares). (T. EVANS ET AL., CELL 33:391, 1983. CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

was consistently associated with two polypeptides, one having a molecular mass of 32 kDa and the other of 45 kDa. The purified MPF preparation possessed a high level of protein kinase activity, as determined by the incorporation of radioactivity from [32P]ATP into proteins. When the purified preparation was incubated in the presence of [32P]ATP, the 45-kDa polypeptide became labeled. By the end of the 1980s, the efforts to uncover the role of cyclins and MPF had begun to merge with another line of research that had been conducted on fission yeast by Paul Nurse and his colleagues at the University of Oxford.9 It had been shown that yeast produced a protein kinase with a molecular weight of 34 kDa whose activity was required for these cells to enter M phase (discussed on page 564). The yeast protein was called p34cdc2 or simply cdc2. The first evidence of a link between cdc2 and MPF came as the result of a collaboration between yeast and amphibian research groups.10,11 Recall from the previous study that MPF was found to contain a 32- and 45-kDa protein. Antibodies formed against cdc2 from fission yeast were shown to react specifically with the 32-kDa component of MPF isolated from Xenopus eggs. These findings indicate that this component of MPF is a homologue of the 34-kDa yeast kinase and, therefore, that the machinery controlling the cell cycle in yeast and vertebrates contains evolutionarily conserved components. A similar study using antibodies against yeast cdc2 showed that the homologous protein in vertebrates does not fluctuate during the cell cycle.12 This supports the proposal that the 32-kDa protein kinase in vertebrate cells depends on another protein. The modulator was predicted to be cyclin, which rises in concentration during each cell cycle and is then destroyed as cells enter anaphase.
This proposal was subsequently verified in a number of studies in which the MPF was purified from amphibians, clams, and starfish, and its polypeptide composition analyzed.13–15 In all of these cases, it was shown that the active MPF present in M-phase animal cells is a complex consisting of two types of subunits: (1) a 32-kDa subunit that contains the protein kinase active site and is homologous to the yeast cdc2 protein kinase, and (2) a larger subunit (45 kDa) identified as a cyclin whose presence is required for kinase activity. The studies described in this Experimental Pathway provided a unified view of the regulation of the cell cycle in all eukaryotic organisms. In addition, they set the stage for analysis of the numerous factors that control the activity of MPF (cdc2) at various points during yeast and mammalian cell cycles, which have become the focus of attention in the

past few years. Many of the most important findings of these recent studies are discussed in the first section of this chapter.
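The central idea of this Experimental Pathway, that the kinase is active only when cyclin has accumulated to a sufficient level and that cyclin is then destroyed, can be sketched as a toy simulation. Everything numerical below is an assumption made for illustration (the synthesis rate, the activation threshold, the length of M phase, and the abrupt resetting of cyclin to zero), as are the function names; regulation of the kinase by phosphorylation is left out entirely.

```python
# Toy model of cyclin-dependent MPF activity. All parameter values and
# function names are hypothetical; none are taken from the studies cited.

def simulate_mpf(total_minutes, synthesis_per_min=1.0,
                 activation_threshold=60.0, m_phase_minutes=15):
    """Return a list of (minute, cyclin_level, mpf_active) tuples.

    Cyclin accumulates at a constant rate. The kinase is treated as active
    whenever cyclin is at or above the threshold. After a fixed period of
    activity, cyclin is destroyed all at once and accumulation starts over.
    """
    cyclin = 0.0
    m_phase_left = 0
    history = []
    for minute in range(total_minutes):
        if m_phase_left > 0:
            m_phase_left -= 1
            if m_phase_left == 0:
                cyclin = 0.0          # proteolysis of cyclin ends the active period
        else:
            cyclin += synthesis_per_min
            if cyclin >= activation_threshold:
                m_phase_left = m_phase_minutes
        history.append((minute, cyclin, cyclin >= activation_threshold))
    return history


def activation_onsets(history):
    """Return the minutes at which MPF switches from inactive to active."""
    onsets = []
    was_active = False
    for minute, _cyclin, active in history:
        if active and not was_active:
            onsets.append(minute)
        was_active = active
    return onsets


normal = simulate_mpf(300)
# Setting synthesis to zero mimics adding an inhibitor of protein synthesis.
blocked = simulate_mpf(300, synthesis_per_min=0.0)

print("onsets with cyclin synthesis:   ", activation_onsets(normal))
print("onsets without cyclin synthesis:", activation_onsets(blocked))
```

With synthesis running, the sketch produces repeated bursts of kinase activity separated by periods in which cyclin must be rebuilt; with synthesis set to zero, the kinase never becomes active, which parallels the observation that sea urchin eggs fail to divide when protein synthesis is inhibited.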

References

1. SMITH, L. D. & ECKER, R. E. 1971. The interaction of steroids with R. pipiens oocytes in the induction of maturation. Dev. Biol. 25:233–247.
2. MASUI, Y. & MARKERT, C. L. 1971. Cytoplasmic control of nuclear behavior during meiotic maturation of frog oocytes. J. Exp. Zool. 177:129–146.
3. WASSERMAN, W. J. & SMITH, L. D. 1978. The cyclic behavior of a cytoplasmic factor controlling nuclear membrane breakdown. J. Cell Biol. 78:R15–R22.
4. SUNKARA, P. S., WRIGHT, D. A., & RAO, P. N. 1979. Mitotic factors from mammalian cells induce germinal vesicle breakdown and chromosome condensation in amphibian oocytes. Proc. Nat’l. Acad. Sci. U.S.A. 76:2799–2802.
5. EVANS, T., ET AL. 1983. Cyclin: A protein specified by maternal mRNA in sea urchin eggs that is destroyed at each cleavage division. Cell 33:389–396.
6. SWENSON, K. I., FARRELL, K. M., & RUDERMAN, J. V. 1986. The clam embryo protein cyclin A induces entry into M phase and the resumption of meiosis in Xenopus oocytes. Cell 47:861–870.
7. WU, M. & GERHART, J. C. 1980. Partial purification and characterization of the maturation-promoting factor from eggs of Xenopus laevis. Dev. Biol. 79:465–477.
8. LOHKA, M. J., HAYES, M. K., & MALLER, J. L. 1988. Purification of maturation-promoting factor, an intracellular regulator of early mitotic events. Proc. Nat’l. Acad. Sci. U.S.A. 85:3009–3013.
9. NURSE, P. 1990. Universal control mechanism regulating onset of M-phase. Nature 344:503–507.
10. GAUTIER, J., ET AL. 1988. Purified maturation-promoting factor contains the product of a Xenopus homolog of the fission yeast cell cycle control gene cdc2+. Cell 54:433–439.
11. DUNPHY, W. G., ET AL. 1988. The Xenopus cdc2 protein is a component of MPF, a cytoplasmic regulator of mitosis. Cell 54:423–431.
12. LABBE, J. C., ET AL. 1988. Activation at M-phase of a protein kinase encoded by a starfish homologue of the cell cycle control gene cdc2. Nature 335:251–254.
13. LABBE, J. C., ET AL. 1989. MPF from starfish oocytes at first meiotic metaphase is a heterodimer containing one molecule of cdc2 and one molecule of cyclin B. EMBO J. 8:3053–3058.
14. DRAETTA, G., ET AL. 1989. cdc2 protein kinase is complexed with both cyclin A and B: Evidence for proteolytic inactivation of MPF. Cell 56:829–838.
15. GAUTIER, J., ET AL. 1990. Cyclin is a component of maturation-promoting factor from Xenopus. Cell 60:487–494.


Synopsis

The stages through which a cell passes from one cell division to the next constitute the cell cycle. The cell cycle is divided into two major phases: M phase, which includes the process of mitosis, in which duplicated chromosomes are separated into two nuclei, and cytokinesis, in which the entire cell is physically divided into two daughter cells; and interphase. Interphase is typically much longer than M phase and is subdivided into three distinct phases based on the time of replication, which is confined to a definite period within the cell cycle. G1 is the period following mitosis and preceding replication; S is the period during which DNA synthesis occurs; and G2 is the period following replication and preceding the onset of mitosis. The length of the cell cycle, and the phases of which it is made up, vary greatly from one type of cell to another. Certain types of terminally differentiated cells, such as vertebrate skeletal muscle and nerve cells, have lost the ability to divide. (p. 573)

Early studies showed that entry of a cell into M phase is triggered by the activation of a protein kinase called MPF. MPF consists of two subunits: a catalytic subunit that transfers phosphate groups to specific serine and threonine residues of specific protein substrates, and a regulatory subunit consisting of a member of a family of proteins called cyclins. The catalytic subunit is called a cyclin-dependent kinase (Cdk). When the cyclin concentration is low, the kinase lacks the cyclin subunit and is inactive. When the cyclin concentration reaches a sufficient level, the kinase is activated, triggering the entry of the cell into M phase. (p. 575)

The activities that control the cell cycle are focused primarily at two points: the transition between G1 and S and the transition between G2 and the entry into mitosis. Passage through each of these points requires the transient activation of a Cdk by a specific cyclin. In yeast, the same Cdk is active at both G1–S and G2–M, but it is stimulated by different cyclins. The concentrations of the various cyclins rise and fall during the cell cycle as the result of changes in the rate of synthesis and destruction of the protein molecules. In addition to regulation by cyclins, Cdk activity is also controlled by the state of phosphorylation of the catalytic subunit, which in turn is controlled by at least two kinases (CAK and Wee1) and a phosphatase (Cdc25). In mammalian cells, at least eight different cyclins and a half-dozen different Cdks play a role in cell cycle regulation. (p. 576)

Cells possess surveillance mechanisms that monitor the status of cell cycle events, such as replication and chromosome compaction, and determine whether or not the cycle continues. If a cell is subjected to treatments that damage DNA, cell cycle progression is delayed until the damage is repaired. Arrest of a cell at one of the checkpoints of the cell cycle is accomplished by inhibitors whose synthesis is stimulated by events such as DNA damage. Once the cell has been stimulated by external agents to traverse the G1-S checkpoint and initiate replication, the cell generally continues without further external stimulation through the subsequent mitosis. (p. 579)

Mitosis ensures that two daughter nuclei receive a complete and equivalent complement of genetic material. Mitosis is divided into prophase, prometaphase, metaphase, anaphase, and telophase. Prophase is characterized by the preparation of chromosomes for segregation and the assembly of the machinery required for chromosome movement. Mitotic chromosomes are highly compacted, rod-shaped structures. Each mitotic chromosome can be seen to be split longitudinally into two chromatids, which are duplicates of one another formed by replication during the previous S phase. The primary constriction on the mitotic chromosome marks the centromere, which houses a platelike structure, the kinetochore, to which the microtubules of the spindle attach. During spindle formation, the centrosomes move away from one another toward the poles. As this happens, the microtubules that stretch between them increase in number and elongate. Eventually, the two centrosomes reach opposite points in the cell, establishing the two poles. A number of types of cells, including those of plants, assemble a mitotic spindle in the absence of centrosomes. The end of prophase is marked by the rupture of the nuclear envelope. (p. 581)

During prometaphase and metaphase, individual chromosomes are attached to spindle microtubules emanating from both poles and then moved to a plane at the center of the spindle. At the beginning of prometaphase, microtubules from the forming spindle penetrate into the region that had been the nucleus and make attachments to the kinetochores of the condensed chromosomes. The kinetochores soon become stably associated with the plus ends of microtubules from both poles of the spindle. Eventually, each chromosome is moved into position along a plane at the center of the spindle, a process that is accompanied by the shortening of some microtubules due to loss of tubulin subunits and the lengthening of others due to addition of subunits. Once the chromosomes are stably aligned, the cell has reached metaphase. The mitotic spindle of a typical animal cell at metaphase consists of astral microtubules, which radiate out from the centrosome; chromosomal microtubules, which are attached to the kinetochores; and polar microtubules that extend past the chromosomes and form a structural basket that maintains the integrity of the spindle. The microtubules of the metaphase spindle exhibit dynamic activity as demonstrated by the movement of fluorescently labeled subunits. (p. 588)

During anaphase and telophase, sister chromatids are separated from one another into separate regions of the dividing cell, and the chromosomes are returned to their interphase condition. Anaphase begins as the sister chromatids suddenly split away from one another, a process that is triggered by the ubiquitin-mediated destruction of cohesin, a protein complex that holds the sisters together. The separated chromosomes then move toward their respective poles accompanied by the shortening of the attached chromosomal microtubules, which results from the net loss of subunits at both the poles and kinetochore. The movement of the chromosomes, which is called anaphase A, is typically accompanied by the elongation of the mitotic spindle and consequent separation of the poles, which is referred to as anaphase B. Telophase is characterized by the reformation of the nuclear envelope, the dispersal of the chromosomes, and the reformation of membranous cytoplasmic networks. (p. 592)

Cytokinesis, which is the division of the cytoplasm into two daughter cells, occurs by constriction in animal cells and by construction in plant cells. Animal cells are constricted in two by an indentation or furrow that forms at the surface of the cell and moves inward. The advancing furrow contains a band of actin filaments, which slide over one another, driven by small, force-generating myosin II filaments. The site of cytokinesis is thought to be selected by a signal diffusing from the mitotic spindle. Plant cells undergo cytokinesis by building a cell membrane and cell wall in a plane lying between the two poles. The first sign of cell plate formation is the appearance of clusters of interdigitating microtubules and interspersed, electron-dense material. Small vesicles then move into the region and become aligned into a plane. The vesicles fuse with one another to form a membranous network that develops into a cell plate. (p. 597)

Meiosis is a process that includes two sequential nuclear divisions, producing haploid daughter nuclei that contain only one member of each pair of homologous chromosomes, thus reducing the number of chromosomes in half. Meiosis can occur at different stages in the life cycle, depending on the type of organism. To ensure that each daughter nucleus has only one set of homologues, an elaborate process of chromosome pairing occurs during prophase I that has no counterpart in mitosis. The pairing of homologous chromosomes is accompanied by the formation of a proteinaceous, ladderlike structure called the synaptonemal complex (SC). The chromatin of each homologue is intimately associated with one of the lateral bars of the SC. During prophase I, homologous chromosomes engage in genetic recombination that produces chromosomes with new combinations of maternal and paternal alleles. Following recombination, the chromosomes become further condensed, and the homologues remain attached to one another at specific points called chiasmata, which represent sites of recombination. The paired homologues (called bivalents or tetrads) become oriented at the metaphase plate so that both chromatids of one chromosome face the same pole. During anaphase I, homologous chromosomes separate from one another with the maternal and paternal chromosomes of each tetrad segregating independently. The cells, which are now haploid in chromosome content, progress into the second meiotic division during which the sister chromatids of each chromosome separate into different daughter nuclei. (p. 602)

Genetic recombination during meiosis occurs as the result of the breakage and reunion of DNA strands from different homologues of a tetrad. During recombination homologous regions on different DNA strands are exchanged without the addition or loss of a single base pair. In an initial step, two duplexes become aligned next to one another. Once they are aligned, breaks occur in both strands of one of the duplexes. In the following steps, a DNA strand from one duplex invades the other duplex, forming an interconnected structure. Subsequent steps include the activity of nucleases and polymerases to create and fill gaps in the strands similar to the way in which DNA repair occurs. (p. 610)


Analytic Questions

1. In what way is cell division a link between humans and the earliest eukaryotic cells?

2. What types of synthetic events would you expect to occur in G1 that do not occur in G2?

3. Suppose you are labeling a population of cells growing asynchronously with [3H]thymidine. G1 is 6 hours, S is 6 hours, G2 is 5 hours, and M is 1 hour. What percentage of the cells would be labeled after a 15-minute pulse? What percentage of mitotic cells would be labeled after such a pulse? How long would you have to chase these cells before you saw any labeled mitotic chromosomes? What percentage of the cells would have labeled mitotic chromosomes if you chased the cells for 18 hours?

4. Suppose you take a culture of the same cells used in the previous question, but instead of pulse-labeling them with [3H]thymidine, you labeled them continuously for 20 hours. Plot a graph showing the amount of radioactive DNA that would be present in the culture over the 20-hour period. What would be the minimum amount of time needed to ensure that all cells had some incorporated label? How could you determine the length of the cell cycle in this culture without use of a radioactive label?

5. Fusing G1- and S-phase cells produces different results from fusing a G2-phase cell with one in S. What would you expect this difference to be and how can it be explained? (Hint: Look back at Figure 13.20, which shows that the initiation of replication requires formation of a prereplication complex, which can only form in G1.)

6. Figure 14.6 shows the effect on the cell cycle of mutations in the genes that encode Wee1 and Cdc25. The kinase CAK was identified biochemically rather than genetically (i.e., by isolation of mutant cells). What phenotype would you expect in a yeast cell carrying a temperature-sensitive CAK mutation, after raising the temperature of the culture medium in early G1? In late G2? Why might it be different depending upon the stage at which the temperature was raised?

7. Give four distinct mechanisms by which a Cdk can be inactivated.

8. A syncytium is a “cell” that contains more than one nucleus; examples are a skeletal muscle fiber and a blastula of a fly embryo. These two types of syncytia arise by very different pathways. What two mechanisms can you envision that could lead to formation of syncytia? What does this tell you about the relationship between mitosis and cytokinesis?

9. How could you determine experimentally whether the polar microtubules were in a state of dynamic flux during anaphase? Knowing the events that occur at this stage, what would you expect to see?

10. If you were to add [3H]thymidine to a cell as it underwent replication (S phase) prior to beginning meiosis, what percentage of the chromosomes of the gametes produced would be labeled? If one of these gametes (a sperm) were to fertilize an unlabeled egg, what percentage of the chromosomes of the two-cell stage would be labeled?

11. If the haploid number of chromosomes in humans is 23 and the amount of nuclear DNA in a sperm is 1C, how many chromosomes does a human cell possess in the following stages: metaphase of mitosis, prophase I of meiosis, anaphase I of meiosis, prophase II of meiosis, anaphase II of meiosis? How many chromatids does the cell have at each of these stages? How much DNA (in terms of numbers of C) does the cell have at each of these stages?

12. Plot the amount of DNA in the nucleus of a spermatogonium from the G1 stage prior to the first meiotic division through the completion of meiosis. Label each of the major stages of the cell cycle and of meiosis on the graph.

13. How many centrioles does a cell have at metaphase of mitosis?

14. Suppose you were told that most cases of trisomy result from aging of an egg in the oviduct as it awaits fertilization. What type of evidence could you obtain from examining spontaneously aborted fetuses that would confirm this suggestion? How would this fit with data that has already been collected?

15. Suppose you incubate a meiotic cell in [3H]thymidine between the leptotene and zygotene stages, then you fix the cell during pachytene and prepare an autoradiograph. You find that the chiasmata are sites of concentrations of silver grains. What does this tell you about the mechanism of recombination?

16. What type of phenotype would you expect of a cell whose Cdc20 polypeptide was mutated so that (1) it was unable to bind to Mad2, or (2) it was unable to bind to the other subunits of the APC, or (3) it failed to dissociate from the APC at the end of anaphase?

17. Assume for a moment that crossing-over did not occur. Would you agree that you received half of your chromosomes from each parent? Would you agree that you received one-quarter of your chromosomes from each grandparent? Would the answer to these questions change if you allowed for crossing-over to have occurred?

18. Fetuses whose cells are triploid, that is, contain three full sets of chromosomes, develop to term and die as infants, whereas fetuses with individual chromosome trisomies tend not to fare as well. How can you explain this observation?

19. What type of phenotype would you expect of a fission yeast cell whose Cdk subunit was lacking each of the following residues as the result of mutation: Tyr 15, Thr 161?

20. It was noted on page 586 that centrosome duplication and DNA synthesis are both initiated by cyclin E–Cdk2, which becomes active at the end of G1. A recent study found that if cyclin E–Cdk2 is activated at an earlier stage, such as the beginning of G1, centrosome duplication begins at that point in the cell cycle, but DNA replication is not initiated until S phase would normally begin. Provide a hypothesis to explain why DNA synthesis does not begin as well. You might look back at Figure 13.20 for further information.


Chapter 15: Cell Signaling and Signal Transduction: Communication Between Cells

15.1 The Basic Elements of Cell Signaling Systems
15.2 A Survey of Extracellular Messengers and Their Receptors
15.3 G Protein-Coupled Receptors and Their Second Messengers
15.4 Protein-Tyrosine Phosphorylation as a Mechanism for Signal Transduction
15.5 The Role of Calcium as an Intracellular Messenger
15.6 Convergence, Divergence, and Cross-Talk Among Different Signaling Pathways
15.7 The Role of NO as an Intercellular Messenger
15.8 Apoptosis (Programmed Cell Death)
THE HUMAN PERSPECTIVE: Disorders Associated with G Protein-Coupled Receptors
THE HUMAN PERSPECTIVE: Signaling Pathways and Human Longevity

The English poet John Donne expressed his belief in the interdependence of humans in the phrase “No man is an island.” The same can be said of the cells that make up a complex multicellular organism. Most cells in a plant or animal are specialized to carry out one or more specific functions. Many biological processes require various cells to work together and to coordinate their activities. To make this possible, cells have to communicate with each other, which is accomplished by a process called cell signaling. Cell signaling makes it possible for cells to respond in an appropriate manner to a specific environmental stimulus. Cell signaling affects virtually every aspect of cell structure and function, which is one of the primary reasons that this chapter appears near the end of the book. On one hand, an understanding of cell signaling requires knowledge about other types of cellular activity. On the other hand, insights into cell signaling can tie together a variety of seemingly independent cellular processes. Cell signaling is also intimately involved in the regulation of cell growth and division. This makes the study of cell signaling crucially important for understanding how a cell can lose the ability to control cell division and develop into a malignant tumor.

Chapter-opening image (labels: β2-AR, Gα, Gβ, Gγ). Three-dimensional, X-ray crystallographic structure of a signaling complex between a β2-adrenergic receptor (β2-AR), which is a representative member of the G protein-coupled receptor (GPCR) superfamily, and a heterotrimeric G protein. The β2-AR is shown in green and the three subunits of the G protein are shown in orange, cyan, and purple. The plasma membrane is shown as a grey shadow. GPCRs are integral membrane proteins characterized by their seven transmembrane helices. As a group, these proteins bind an astonishing array of biological messengers, which constitutes the first step in eliciting many of the body’s most basic responses. Binding of an agonist (e.g., epinephrine) to the binding pocket in the extracellular domain of the β2-AR causes a conformational change in the receptor that causes it to bind the G protein, which transmits the signal into the cell, inducing responses such as increased heart rate and relaxation of smooth muscle cells. Beta-adrenergic receptors are the targets of a number of important drugs, including β-blockers, which are widely prescribed for the treatment of high blood pressure and heart arrhythmias. GPCRs had been very difficult to crystallize, so that high-resolution structures of these important proteins had been lacking. This situation is now changing as the result of recent advances in crystallization technology. In this case, crystallization of the signaling complex was accomplished using two additional small proteins (not shown in the image) that bound to the β2-AR and the G protein to stabilize the complex during the crystallization process. (COURTESY OF SØREN G. F. RASMUSSEN, ANDREW C. KRUSE, AND BRIAN K. KOBILKA.)

15.1 | The Basic Elements of Cell Signaling Systems

It may be helpful to begin the discussion of this complex subject by describing a few of the general features that are shared by most signaling pathways. Cells usually communicate with each other through extracellular messenger molecules. Extracellular messengers can travel a short distance and stimulate cells that are in close proximity to the origin of the message, or they can travel throughout the body, potentially stimulating cells that are far away from the source. In the case of autocrine signaling, the cell that is producing the messenger expresses receptors on its surface that can respond to that messenger (Figure 15.1a). Consequently, cells releasing the message will stimulate (or inhibit) themselves. During paracrine signaling (Figure 15.1b), messenger molecules travel only short distances through the extracellular space to cells that are in close proximity to the cell that is generating the message. Paracrine messenger molecules are usually limited in their ability to travel around the body because they are inherently unstable, or they are degraded by enzymes, or they bind to the extracellular matrix. Finally, during endocrine signaling, messenger molecules reach their target cells via passage through the bloodstream (Figure 15.1c). Endocrine messengers are also called hormones, and they typically act on target cells located at distant sites in the body.

Figure 15.1 Autocrine (a), paracrine (b), and endocrine (c) types of intercellular signaling.

An overview of cellular signaling pathways is depicted in Figure 15.2. Cell signaling is initiated with the release of a messenger molecule by a cell that is engaged in sending messages to other cells in the body (step 1, Figure 15.2). The extracellular environments of cells contain hundreds of different informational molecules, ranging from small compounds (e.g., steroids and neurotransmitters) to small, soluble protein hormones (e.g., glucagon and insulin) to huge glycoproteins bound to the surfaces of other cells. Cells can only respond to a particular extracellular message if they express receptors that specifically recognize and bind that messenger molecule (step 2). The molecule that binds to the receptor is called a ligand. Different types of cells possess different complements of receptors, which allow them to respond to different extracellular messengers. Even cells that share a specific receptor may respond very differently to the same extracellular messenger. Liver cells and smooth muscle cells both possess the β2-adrenergic receptor shown in the chapter-opening image on page 617. Activation of this receptor by circulating adrenaline leads to glycogen breakdown in a liver cell and relaxation in a smooth muscle cell. These different outcomes following interaction with the same initial stimulus can be traced to different intracellular proteins that become engaged in the response in these two types of cells. Thus the type of activities in which a cell engages depends both on the stimuli that it receives and the intracellular machinery that it possesses at that particular time in its life.

In most cases, the extracellular messenger molecule binds to a receptor at the outer surface of the responding cell. This interaction induces a conformational change in the receptor that causes the signal to be relayed across the membrane to the receptor’s cytoplasmic domain (step 3, Figure 15.2). Once it has reached the inner surface of the plasma membrane, there are two major routes by which the signal is transmitted into the cell interior, where it elicits the appropriate response. The particular route taken depends on the type of receptor that is activated. In the following discussion, we will focus on these two major routes of signal transduction, but keep in mind there are other ways that extracellular signals can have an impact on a cell. For example, we saw on page 169 how neurotransmitters act by opening plasma membrane ion channels and on page 525 how steroid hormones diffuse through the plasma membrane and bind to intracellular receptors. In the two major routes discussed in this chapter:

■ One type of receptor (Section 15.3) transmits a signal from its cytoplasmic domain to a nearby enzyme (step 4), which generates a second messenger (step 5). Because it brings about (effects) the cellular response by generating a second messenger, the enzyme responsible is referred to as an effector. Second messengers are small substances that typically activate (or inactivate) specific proteins. Depending on its chemical structure, a second messenger may diffuse through the cytosol or remain embedded in the lipid bilayer of a membrane.

■ Another type of receptor (Section 15.4) transmits a signal by transforming its cytoplasmic domain into a recruiting station for cellular signaling proteins (step 4a). Proteins interact with one another, or with components of a cellular membrane, by means of specific types of interaction domains, such as the SH3 domain discussed on page 62.

Figure 15.2 An overview of the major signaling pathways by which extracellular messenger molecules can elicit intracellular responses. Two different types of signal transduction pathways are depicted, one in which a signaling pathway is activated by a diffusible second messenger and another in which a signaling pathway is activated by recruitment of proteins to the plasma membrane. Most signal transduction pathways involve a combination of these mechanisms. It should also be noted that signaling pathways are not typically linear tracks as depicted here, but are branched and interconnected to form a complex web. The steps are described in the text.

Whether the signal is transmitted by a second messenger or by protein recruitment, the outcome is similar: a protein that is positioned at the top of an intracellular signaling pathway is activated (step 6, Figure 15.2). Signaling pathways are the information superhighways of the cell. Each signaling pathway consists of a series of distinct proteins that operate in sequence (step 7). Most “signaling proteins” are constructed of multiple domains, which allows them to interact in a dynamic way with a number of different partners, either simultaneously or sequentially. This type of modular construction is illustrated by the Grb2 and IRS-1 proteins depicted in Figures 15.20 and 15.25a, respectively. Unlike Grb2 and IRS-1, which function exclusively in mediating protein–protein interactions, many signaling proteins also contain catalytic and/or regulatory domains that give them a more active role in a signaling pathway.

Each protein in a signaling pathway typically acts by altering the conformation of the subsequent (or downstream) protein in the series, an event that activates or inhibits that protein (Figure 15.3). It should come as no surprise, after reading about other topics in cell biology, that alterations in the conformation of signaling proteins are often accomplished by protein kinases and protein phosphatases that, respectively, add or remove phosphate groups from other proteins (Figure 15.3). The human genome encodes more than 500 different protein kinases and approximately 150 different protein phosphatases. Whereas protein kinases typically work as a single subunit, many protein phosphatases contain a key regulatory subunit that determines substrate specificity. As a result, a single phosphatase catalytic subunit can form a host of different enzymes that remove phosphate groups from different protein substrates. Most protein kinases transfer phosphate groups to serine or threonine residues of their protein substrates, but a very important group of kinases (roughly 90 in humans) phosphorylates tyrosine residues. Some protein kinases and phosphatases are soluble cytoplasmic proteins; others are integral membrane proteins. Many kinases are present in the cell in a self-inhibited state. Depending on the particular kinase, these enzymes can be activated in the cell by covalent modification or by interactions with other proteins, small molecules, or membrane lipids. It is remarkable that, even though thousands of proteins in a cell contain amino acid residues with the potential of being phosphorylated, each protein kinase or phosphatase is able to recognize only its specific substrates and ignore all of the others. Some protein kinases and phosphatases have numerous proteins as their substrates, whereas others phosphorylate or dephosphorylate only a single amino acid residue of a single protein substrate. Many of the protein substrates of these enzymes are enzymes themselves (most often other kinases and phosphatases), but the substrates also include ion channels, transcription factors, and various types of regulatory proteins. It is thought that at least 50 percent of transmembrane and cytoplasmic proteins are phosphorylated at one or more sites.

Figure 15.3 Signal transduction pathway consisting of protein kinases and protein phosphatases whose catalytic actions change the conformations, and thus the activities, of the proteins they modify. In the example depicted here, protein kinase 2 is activated by protein kinase 1. Once activated, protein kinase 2 phosphorylates protein kinase 3, activating the enzyme. Protein kinase 3 then phosphorylates a transcription factor, increasing its affinity for a site on the DNA. Binding of a transcription factor to the DNA affects the transcription of the gene in question. Each of these activation steps in the pathway is reversed by a phosphatase.

Protein phosphorylation can change protein behavior in several different ways. Phosphorylation can activate or inactivate an enzyme, it can increase or decrease protein–protein interactions, it can induce a protein to move from one subcellular compartment to another, or it can act as a signal that initiates protein degradation. Large-scale proteomic approaches have been employed (page 70) to identify the substrates of various protein kinases and the specific residues that are phosphorylated in various tissues. One recent study of nine different mouse tissues identified more than 6,000 phosphoproteins harboring nearly 36,000 sites of phosphorylation. The widespread occurrence of protein phosphorylation and its presumed importance is seen in Figure 15.4, which shows the marked difference in the frequency of tyrosine phosphorylation in certain proteins from two different types of cancer cells. The primary challenge is to understand the roles of these diverse posttranslational modifications in the activities of different cell types.

Figure 15.4 A comparison in the frequency of tyrosine phosphorylation in two different types of breast cancer cells. The panels on the left side of the figure show the frequency of phosphotyrosine (pTyr) residues in certain proteins (named on the right side of the figure) in triple-negative breast cancer cell lines. Triple-negative cells do not express three major molecular signatures of many breast cancer cells: estrogen receptor, progesterone receptor, and the growth factor receptor HER2. The frequency of pTyr residues in breast cancer cells that express these three signature proteins is shown in the panels on the right. The frequency of pTyr residues in a given protein in a given cell line is indicated by the intensity of the red shading of the box (see the key at the bottom of the figure). The names of the cell lines tested are given along the top of the figure. It is evident that the triple-negative cells have a much greater level of tyrosine phosphorylation than the other breast cancer cells. This may correlate with the loss of a particular protein tyrosine phosphatase activity (PTPN12) in many of the triple-negative cancers. (FROM J. G. ALBECK AND J. S. BRUGGE, FROM DATA BY TING-LEI GU OF CELL SIGNALING TECHNOLOGY, CELL 144:639, 2011. REPRINTED WITH PERMISSION FROM ELSEVIER.)

Signals transmitted along such signaling pathways ultimately reach target proteins (step 8, Figure 15.2) involved in basic cellular processes (step 9). Depending on the type of cell and message, the response initiated by the target protein may involve a change in gene expression, an alteration of the activity of metabolic enzymes, a reconfiguration of the cytoskeleton, an increase or decrease in cell mobility, a change in ion permeability, activation of DNA synthesis, or even the death of the cell. Virtually every activity in which a cell is engaged is regulated by signals originating at the cell surface. This overall process in which information carried by extracellular messenger molecules is translated into changes that occur inside a cell is referred to as signal transduction.

Finally, signaling has to be terminated. This is important because cells have to be responsive to additional messages that they may receive. The first order of business is to eliminate the extracellular messenger molecule. To do this, certain cells produce extracellular enzymes that destroy specific extracellular messengers. In other cases, activated receptors are internalized (page 624). Once inside the cell, the receptor may be degraded together with its ligand, which can leave the cell with decreased sensitivity to subsequent stimuli.
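The kinase cascade of Figure 15.3 can be pictured as a chain of on/off switches. The following minimal Python sketch is an illustration only, not a quantitative model: the class `SignalingProtein` and the functions `run_cascade` and `apply_phosphatases` are hypothetical names introduced here, and the sketch assumes that phosphorylation always activates a protein, whereas in real pathways phosphorylation may also inhibit.

```python
from dataclasses import dataclass


@dataclass
class SignalingProtein:
    """Toy signaling protein that is active only while phosphorylated."""
    name: str
    phosphorylated: bool = False

    @property
    def active(self) -> bool:
        # Assumption for this sketch: phosphorylation always activates.
        return self.phosphorylated


def run_cascade(pathway, upstream_kinase_active=True):
    """Pass a signal down an ordered list of proteins, one step at a time.

    Each protein is phosphorylated only if the protein upstream of it is
    active, so a block at any step silences everything downstream.
    Returns True if the last protein in the pathway ends up active.
    """
    signal = upstream_kinase_active
    for protein in pathway:
        protein.phosphorylated = signal
        signal = protein.active
    return signal


def apply_phosphatases(pathway):
    """Reverse every activation step by removing the phosphate groups."""
    for protein in pathway:
        protein.phosphorylated = False


# Protein kinase 1 is treated as the already-active upstream kinase.
pathway = [
    SignalingProtein("protein kinase 2"),
    SignalingProtein("protein kinase 3"),
    SignalingProtein("transcription factor"),
]

reached_target = run_cascade(pathway)
print(reached_target, [p.name for p in pathway if p.active])

apply_phosphatases(pathway)
print([p.name for p in pathway if p.active])
```

Calling `run_cascade(pathway, upstream_kinase_active=False)` leaves every protein inactive, which mirrors the point that each step depends on the one before it. A branched, interconnected web of the kind described in the legend of Figure 15.2 would need a graph rather than a list.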

REVIEW

1. What is meant by the term signal transduction? What are some of the steps by which signal transduction can occur?
2. What is a second messenger? Why do you suppose it is called this?

15.2 | A Survey of Extracellular Messengers and Their Receptors

A large variety of molecules can function as extracellular carriers of information. These include

■ Amino acids and amino acid derivatives. Examples include glutamate, glycine, acetylcholine, epinephrine, dopamine, and thyroid hormone. These molecules act as neurotransmitters and hormones.

■ Gases, such as NO and CO.

■ Steroids, which are derived from cholesterol. Steroid hormones regulate sexual differentiation, pregnancy, carbohydrate metabolism, and excretion of sodium and potassium ions.

■ Eicosanoids, which are nonpolar molecules containing 20 carbons that are derived from a fatty acid named arachidonic acid. Eicosanoids regulate a variety of processes including pain, inflammation, blood pressure, and blood clotting. Several over-the-counter drugs that are used to treat headaches and inflammation inhibit eicosanoid synthesis.

■ A wide variety of polypeptides and proteins. Some of these are present as transmembrane proteins on the surface of an interacting cell (page 251). Others are part of, or associate with, the extracellular matrix. Finally, a large number of proteins are excreted into the extracellular environment where they are involved in regulating processes such as cell division, differentiation, the immune response, or cell death and cell survival.

Extracellular signaling molecules are usually, but not always, recognized by specific receptors that are present on the surface of the responding cell. As illustrated in Figure 15.2, receptors bind their signaling molecules with high affinity and translate this interaction at the outer surface of the cell into changes that take place on the inside of the cell. The receptors that have evolved to mediate signal transduction are indicated below.

■ G protein-coupled receptors (GPCRs) are a huge family of receptors that contain seven transmembrane α helices. These receptors translate the binding of extracellular signaling molecules into the activation of GTP-binding proteins. GTP-binding proteins (or G proteins) were discussed in connection with vesicle budding and fusion in Chapter 8, microtubule dynamics in Chapter 9, protein synthesis in Chapters 8 and 11, and nucleocytoplasmic transport in Chapter 12. In the present chapter, we will explore their role in transmitting messages along “cellular information circuits.”

■ Receptor protein-tyrosine kinases (RTKs) represent a second class of receptors that have evolved to translate the presence of extracellular messenger molecules into changes inside the cell. Binding of a specific extracellular ligand to an RTK usually results in receptor dimerization followed by activation of the receptor’s protein-kinase domain, which is present within its cytoplasmic region. Upon activation, these protein kinases phosphorylate specific tyrosine residues of cytoplasmic substrate proteins, thereby altering their activity, their localization, or their ability to interact with other proteins within the cell.

■ Ligand-gated channels represent a third class of cell-surface receptors that bind to extracellular ligands. The ability of these proteins to conduct a flow of ions across the plasma membrane is regulated directly by ligand binding. A flow of ions across the membrane can result in a temporary change in membrane potential, which will affect the activity of other membrane proteins, for instance, voltage-gated channels. This sequence of events is the basis for formation of a nerve impulse (page 166). In addition, the influx of certain ions, such as Ca2+, can change the activity of particular cytoplasmic enzymes. As discussed in Section 4.8, one large group of ligand-gated channels functions as receptors for neurotransmitters.

■ Steroid hormone receptors function as ligand-regulated transcription factors. Steroid hormones diffuse across the plasma membrane and bind to their receptors, which are present in the cytoplasm. Hormone binding results in a conformational change that causes the hormone–receptor complex to move into the nucleus and bind to elements present in the promoters or enhancers of hormone-responsive genes (see Figure 12.47). This interaction gives rise to an increase or decrease in the rate of gene transcription.

■ Finally, there are a number of other types of receptors that act by unique mechanisms. Some of these receptors, for example, the B- and T-cell receptors that are involved in the response to foreign antigens, associate with known signaling molecules such as cytoplasmic protein-tyrosine kinases.

We will concentrate in this chapter on the GPCRs and RTKs.

15.3 | G Protein-Coupled Receptors and Their Second Messengers

G protein-coupled receptors (GPCRs) are so named because they interact with G proteins, as discussed below. Members of the GPCR superfamily are also referred to as seven-transmembrane (7TM) receptors because they contain seven transmembrane helices (Figure 15.5). Thousands of different GPCRs have been identified in organisms ranging from yeast to flowering plants and mammals that together regulate an extraordinary spectrum of cellular processes. In fact, GPCRs constitute the single largest superfamily of proteins encoded by animal genomes. Included among the natural ligands that bind to GPCRs are a diverse array of hormones (both plant and animal), neurotransmitters, opium derivatives, chemoattractants (e.g., molecules that attract phagocytic cells of the immune system), odorants and tastants (molecules detected by olfactory and gustatory receptors eliciting the senses of smell and taste), and photons. A list of some of the ligands that operate by means of this pathway and the effectors through which they act is provided in Table 15.1.

Figure 15.5 The membrane-bound machinery for transducing signals by means of a seven transmembrane receptor and a heterotrimeric G protein. (a) Receptors of this type, including those that bind epinephrine and glucagon, contain seven membrane-spanning helices. When bound to their ligand, the receptor interacts with a trimeric G protein, which activates an effector, such as adenylyl cyclase. As indicated in the figure, the α and γ subunits of the G protein are linked to the membrane by lipid groups that are embedded in the lipid bilayer. (Note: Many GPCRs may be active as complexes of two or more receptor molecules.) (b) A model depicting the activation of the GPCR rhodopsin based on X-ray crystallographic structures. On the left, rhodopsin is shown in its inactive (dark-adapted) conformation together with an unbound heterotrimeric G protein (called transducin). When the retinal cofactor (shown in red on the left rhodopsin molecule) absorbs a photon, it undergoes an isomerization reaction (from a cis to a trans form), which leads to the disruption of an ionic linkage between residues on the third and sixth transmembrane helix of the protein. This event in turn leads to a change in conformation of the protein, including the outward tilt and rotation of the sixth transmembrane helix (red curved arrow), which exposes a binding site for the Gα subunit of the G protein. The rhodopsin molecule on the right is shown in the proposed active conformation with a portion of the Gα subunit (in red) bound to the receptor’s cytoplasmic face. (B: FROM THUE W. SCHWARTZ AND WAYNE L. HUBBELL, NATURE 455, 473, 2008. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

Signal Transduction by G Protein-Coupled Receptors

Receptors

G protein-coupled receptors normally have the following topology. Their amino-terminus is present on the outside of the cell, the seven α helices that traverse the plasma membrane are connected by loops of varying length, and the carboxyl-terminus is present on the inside of the cell (Figure 15.5). There are three loops present on the outside of the cell

Table 15.1 Examples of Physiologic Processes Mediated by GPCRs and Heterotrimeric G Proteins

Stimulus | Receptor | Effector | Physiologic response
Epinephrine | β-Adrenergic receptor | Adenylyl cyclase | Glycogen breakdown
Serotonin | Serotonin receptor | Adenylyl cyclase | Behavioral sensitization and learning in Aplysia
Light | Rhodopsin | cGMP phosphodiesterase | Visual excitation
IgE–antigen complexes | Mast cell IgE receptor | Phospholipase C | Secretion
f-Met peptide | Chemotactic receptor | Phospholipase C | Chemotaxis
Acetylcholine | Muscarinic receptor | Potassium channel | Slowing of pacemaker activity

Adapted from L. Stryer and H. R. Bourne, reproduced with permission from the Annual Review of Cell Biology, Vol. 2, copyright 1986, by Annual Reviews Inc., via Copyright Clearance Center.

Figure 15.6 The mechanism of receptor-mediated activation (or inhibition) of effectors by means of heterotrimeric G proteins. In step 1, the ligand binds to the receptor, altering its conformation and increasing its affinity for the G protein to which it binds. In step 2, the Gα subunit releases its GDP, which is replaced by GTP. In step 3, the Gα subunit dissociates from the Gβγ complex and binds to an effector (in this case adenylyl cyclase), activating the effector. The Gβγ dimer may also bind to an effector (not shown), such as an ion channel or an enzyme. In step 4, activated adenylyl cyclase produces cAMP. In step 5, the GTPase activity of Gα hydrolyzes the bound GTP, deactivating Gα. In step 6, Gα reassociates with Gβγ, reforming the trimeric G protein, and the effector ceases its activity. In step 7, the receptor has been phosphorylated by a GRK, and in step 8 the phosphorylated receptor has been bound by an arrestin molecule, which inhibits the ligand-bound receptor from activating additional G proteins. The receptor bound to arrestin is likely to be taken up by endocytosis.

15.3 G Protein-Coupled Receptors and Their Second Messengers

that, together, form the ligand-binding pocket, whose structure varies among different GPCRs. There are also three loops present on the cytoplasmic side of the plasma membrane that provide binding sites for intracellular signaling proteins.

It is very difficult, for a number of technical reasons, to prepare crystals of unmodified GPCRs that are suitable for X-ray crystallographic analysis. For a number of years, rhodopsin was the only member of the superfamily to have its X-ray crystal structure determined. Rhodopsin has an unusually stable structure for a GPCR, owing to the fact that its ligand (a retinal group) is permanently bound to the protein and the protein molecule can only exist in a single inactive conformation in the absence of a stimulus (i.e., in the dark). Beginning in 2007, as a result of years of effort by a number of research groups, a flurry of GPCR crystal structures appeared in the literature. For the most part, these structures revealed the GPCR in the inactive state, but more recent reports have described the structures of modified versions of several GPCRs in the active state. There is also a body of structural and spectroscopic data (of the type described on page 135) that provides insights into some of the conformational changes that occur as a GPCR is activated and becomes bound to a G protein. The first X-ray crystal structure of an active GPCR with its bound G protein was solved by Brian Kobilka and colleagues at Stanford University and is shown in the chapter-opening image on page 617.

The inactive conformation of the GPCR is stabilized by noncovalent interactions between specific residues in the transmembrane α helices. Ligand binding disturbs these interactions, thereby causing the receptor to assume an active conformation. This requires rotations and shifts of the transmembrane α helices relative to each other. Because they are attached to the cytoplasmic loops, rotation or movement of these transmembrane α helices causes changes in the conformation of the cytoplasmic loops. This in turn leads to an increase in the affinity of the receptor for a G protein that is present on the cytoplasmic surface of the plasma membrane (Figure 15.5b). As a consequence, the ligand-bound receptor forms a receptor–G protein complex (Figure 15.6, step 1). The interaction with the receptor induces a conformational change in the α subunit of a G protein, causing the release of GDP, which is followed by binding of GTP (step 2). While in the activated state, a single receptor can

Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells


activate a number of G protein molecules, providing a means of signal amplification (discussed further on page 632).

G Proteins

Heterotrimeric G proteins were discovered, purified, and characterized by Martin Rodbell and his colleagues at the National Institutes of Health and Alfred Gilman and colleagues at the University of Virginia. Their studies are discussed in the Experimental Pathways, which can be found on the Web at www.wiley.com/college/karp. These proteins are referred to as G proteins because they bind guanine nucleotides, either GDP or GTP. They are described as heterotrimeric because all of them consist of three different polypeptide subunits, called α, β, and γ. This property distinguishes them from small, monomeric G proteins, such as Ras, which are discussed later in this chapter. Heterotrimeric G proteins are held at the plasma membrane by lipid chains that are covalently attached to the α and γ subunits (Figure 15.5a).

The guanine nucleotide-binding site is present on the Gα subunit. Replacement of GDP by GTP, following interaction with an activated GPCR, results in a conformational change in the Gα subunit. In its GTP-bound conformation, the Gα subunit has a low affinity for Gβγ, leading to dissociation of the trimeric complex. Each dissociated Gα subunit (with GTP attached) is free to activate an effector protein, such as adenylyl cyclase (Figure 15.6, step 3). In this case, activation of the effector leads to the production of the second messenger cAMP (step 4). Other effectors include phospholipase C-β and cyclic GMP phosphodiesterase (see below). Second messengers, in turn, activate one or more cellular signaling proteins.

A G protein is said to be “on” when its α subunit is bound to GTP. Gα subunits can turn themselves off by hydrolysis of GTP to GDP and inorganic phosphate (Pi) (Figure 15.6, step 5). This results in a conformational change causing a decrease in the affinity for the effector and an increase in the affinity for the βγ subunit. Thus, following hydrolysis of GTP, the Gα subunit will dissociate from the effector and reassociate with the βγ subunit to reform the inactive heterotrimeric G protein (step 6). In a sense, heterotrimeric G proteins function as molecular timers. They are turned on by the interaction with an activated receptor and turn themselves off by hydrolysis of bound GTP after a certain amount of time has passed. While they are active, Gα subunits can turn on downstream effectors.

Heterotrimeric G proteins come in four flavors: Gs, Gq, Gi, and G12/13. This classification is based on the Gα subunits and the effectors to which they couple. The particular response elicited by an activated GPCR depends on the type of G protein with which it interacts, although some GPCRs can interact with different G proteins and trigger more than one physiologic response. Gs family members couple receptors to adenylyl cyclase. Adenylyl cyclase is activated by GTP-bound Gs subunits. Gq family members contain Gα subunits that activate PLCβ. PLCβ hydrolyzes phosphatidylinositol bisphosphate, producing inositol trisphosphate and diacylglycerol (page 630). Activated Gi subunits function by inhibiting adenylyl cyclase. G12/13 members are less well characterized than the other G protein families, although their inappropriate activation has been associated with excessive cell proliferation and malignant transformation. Following its dissociation from the Gα subunit, the βγ complex also has a signaling function, and it can couple to a number of different types of effectors, including PLCβ, K+ and Ca2+ ion channels, and adenylyl cyclase.
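The on/off cycle of a heterotrimeric G protein (Figure 15.6, steps 2 through 6) can be pictured as a small state machine. The following Python sketch is only an illustration of that idea; the state names, event names, and helper functions are hypothetical labels chosen for this example and are not taken from the text.

```python
from enum import Enum


class GState(Enum):
    """States of a heterotrimeric G protein in the cycle of Figure 15.6."""
    TRIMER_GDP = "inactive trimer (G-alpha-GDP bound to G-beta-gamma)"
    TRIMER_GTP = "trimer after GDP has been exchanged for GTP"
    ALPHA_GTP_FREE = "G-alpha-GTP dissociated; effector switched on"
    ALPHA_GDP_FREE = "G-alpha-GDP after hydrolysis; effector switched off"


# Event that drives each transition; the event names are illustrative labels.
TRANSITIONS = {
    (GState.TRIMER_GDP, "activated_receptor"): GState.TRIMER_GTP,
    (GState.TRIMER_GTP, "dissociate"): GState.ALPHA_GTP_FREE,
    (GState.ALPHA_GTP_FREE, "hydrolyze_gtp"): GState.ALPHA_GDP_FREE,
    (GState.ALPHA_GDP_FREE, "reassociate"): GState.TRIMER_GDP,
}


def step(state, event):
    """Return the next state, or the same state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def effector_on(state):
    """The effector is stimulated only while free G-alpha carries GTP."""
    return state is GState.ALPHA_GTP_FREE


def run_cycle(events, start=GState.TRIMER_GDP):
    """Apply the events in order and return the list of visited states."""
    states = [start]
    for event in events:
        states.append(step(states[-1], event))
    return states
```

Running `run_cycle(["activated_receptor", "dissociate", "hydrolyze_gtp", "reassociate"])` walks the protein once around the cycle and returns it to the inactive trimer; the effector is on in exactly one of the visited states.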

Termination of the Response

We have seen that ligand binding results in receptor activation. The activated receptors turn on G proteins, and G proteins turn on effectors. To prevent overstimulation, receptors have to be blocked from continuing to activate G proteins. To regain sensitivity to future stimuli, the receptor, the G protein, and the effector must all be returned to their inactive state. Desensitization, the process that blocks active receptors from turning on additional G proteins, takes place in two steps. In the first step, the cytoplasmic domain of the activated GPCR is phosphorylated by a specific type of kinase, called G protein-coupled receptor kinase (GRK) (Figure 15.6, step 7). GRKs form a small family of serine-threonine protein kinases that specifically recognize activated GPCRs. Phosphorylation of the GPCR sets the stage for the second step, which is the binding of proteins called arrestins (Figure 15.6, step 8). Arrestins form a small family of proteins that bind to GPCRs and compete for binding with heterotrimeric G proteins. As a consequence, arrestin binding prevents the further activation of additional G proteins. This action is termed desensitization because the cell stops responding to the stimulus while that stimulus is still acting on the outer surface of the cell. Desensitization is one of the mechanisms that allow a cell to respond to a change in its environment, rather than continuing to “fire” endlessly in the presence of an unchanging environment. The importance of desensitization is illustrated by the observation that mutations that interfere with phosphorylation of rhodopsin by a GRK lead to the death of the photoreceptor cells in the retina. This type of retinal cell death is thought to be one of the causes of blindness resulting from the disease retinitis pigmentosa.

Arrestins can be described as protein hubs (page 62), in that they are capable of binding to a variety of different proteins involved in different intracellular processes. While they are bound to phosphorylated GPCRs (step 1, Figure 15.7), arrestin molecules are also capable of binding to AP2 adaptor molecules that are situated in clathrin-coated pits (page 310). The interaction between bound arrestin and clathrin-coated pits (step 2) promotes the uptake of phosphorylated GPCRs into the cell by endocytosis. Depending on the circumstances, receptors that have been removed from the surface by endocytosis can participate in several alternative outcomes. In some cases, receptors travel along the endocytic pathway into endosomes (page 312), where the associated arrestin molecules serve as a scaffold for the assembly of various cytoplasmic signaling complexes. The MAPK pathway, which is discussed at length later in the chapter, is thought to be activated by arrestin-bound GPCRs localized within endosomes (step 3). The discovery of “signaling endosomes” came as a surprise to researchers in the field of cell signaling who had been working under the assumption that GPCRs (and RTKs as well) were only capable of signal transduction when they resided at the cell surface. Now it appears that the signals transmitted from endosomes have different properties and physiological roles from those that arise from the plasma membrane. In a second outcome, internalized receptors may traffic from endosomes to lysosomes where they are degraded (step 4). If receptors are

Figure 15.7 Arrestin-mediated internalization of GPCRs. Arrestin-bound GPCRs (step 1) are internalized when they are trapped in clathrin-coated pits, which bud into the cytoplasm (step 2). As discussed in Section 8.8, clathrin-coated buds are transformed into clathrin-coated vesicles, which deliver their contents, including the GPCRs, to endosomes. When present in the endosomes, arrestins can serve as scaffolds for the assembly of signaling complexes, including those that activate the MAPK cascade and its kinase ERK (step 3). Alternatively, the GPCRs can be delivered to lysosomes, where they are degraded (step 4), or they can be returned to the plasma membrane in a recycling endosome (step 5), where they can then interact with new extracellular ligands (step 6). (FROM S. L. RITTER AND R. A. HALL, NATURE REVIEWS MCB 10:820, 2009, BOX 1B. NATURE REVIEWS MOLECULAR CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)
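The alternative fates of an internalized receptor described for Figure 15.7 form a small branching pathway. As a rough sketch, the pathway can be written as a directed graph and its routes enumerated; the compartment labels below are hypothetical names invented for this example, and the graph encodes only the steps named in the figure legend.

```python
# Each compartment maps to the compartments a receptor can move to next.
# The labels are illustrative; the branching follows Figure 15.7.
FATES = {
    "arrestin_bound_at_surface": ["clathrin_coated_pit"],
    "clathrin_coated_pit": ["endosome"],
    "endosome": ["signaling_from_endosome", "lysosome", "recycling_endosome"],
    "signaling_from_endosome": [],
    "lysosome": [],
    "recycling_endosome": ["plasma_membrane"],
    "plasma_membrane": [],
}


def routes(start, graph=FATES):
    """Return every path from start to a terminal compartment (depth-first)."""
    next_stops = graph[start]
    if not next_stops:
        return [[start]]
    return [[start] + rest for stop in next_stops for rest in routes(stop, graph)]
```

Calling `routes("arrestin_bound_at_surface")` yields three routes, one for each outcome discussed in the text: signaling from the endosome, degradation in the lysosome, and return to the plasma membrane.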

THE HUMAN PERSPECTIVE

Disorders Associated with G Protein-Coupled Receptors

GPCRs represent the largest family of genes encoded by the human genome. Their importance in human biology is reflected by the fact that over one-third of all prescription drugs act as ligands that bind to this huge superfamily of receptors. A number of inherited disorders have been traced to defects in both GPCRs (Figure 1) and heterotrimeric G proteins (Table 1). Retinitis pigmentosa (RP) is an inherited disease characterized by progressive degeneration of the retina and eventual blindness. RP can be caused by mutations in the

Figure 1 Two-dimensional representation of a “composite” transmembrane receptor showing the approximate sites of a number of mutations responsible for causing human diseases. Most of the mutations (numbers 1, 2, 5, 6, 7, and 8) result in constitutive stimulation of the effector, but others (3 and 4) result in blockage of the receptor’s ability to stimulate the effector. Mutations at sites 1 and 2 are found in the MSH (melanocyte-stimulating hormone) receptor; 3 in the ACTH (adrenocorticotrophic hormone) receptor; 4 in the vasopressin receptor; 5 and 6 in the TSH (thyroid-stimulating hormone) receptor; 7 in the LH (luteinizing hormone) receptor; and 8 in rhodopsin, the light-sensitive pigment of the retina.

gene that encodes rhodopsin, the visual pigment of the rods. Many of these mutations lead to premature termination or improper folding of the rhodopsin protein and its elimination from the cell before it reaches the plasma membrane (page 288). Other mutations may lead


degraded, the cells lose, at least temporarily, sensitivity for the ligand in question. Finally, according to a third scheme, the arrestin-bound GPCRs may be dephosphorylated and returned to the plasma membrane (step 5). If receptors are returned to the cell surface, the cells remain sensitive to the ligand (they are said to be resensitized).

Signaling by the activated Gα subunit is terminated by a less complex mechanism: the bound GTP molecule is simply hydrolyzed to GDP (step 5, Figure 15.6). Thus, the strength and duration of the signal are determined in part by the rate of GTP hydrolysis by the Gα subunit. Gα subunits possess a weak GTPase activity, which allows them to slowly hydrolyze the bound GTP and inactivate themselves. Termination of the response is accelerated by regulators of G protein signaling (RGSs). The interaction with an RGS protein increases the rate of GTP hydrolysis by the Gα subunit. Once the GTP is hydrolyzed, the Gα-GDP reassociates with the Gβγ subunits to reform the inactive trimeric complex (step 6) as discussed above. This returns the system to the resting state.

The mechanism for transmitting signals across the plasma membrane by G proteins is of ancient evolutionary origin and is highly conserved. This is illustrated by an experiment in which yeast cells were genetically engineered to express a receptor for the mammalian hormone somatostatin. When these yeast cells were treated with somatostatin, the mammalian receptors at the cell surface interacted with the yeast heterotrimeric G proteins at the inner surface of the membrane and triggered a response leading to proliferation of the yeast cells. The effects of certain mutations on the function of G protein-coupled receptors are discussed in the accompanying Human Perspective.
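The statement that signal duration depends on the rate of GTP hydrolysis can be made concrete with a simple first-order model, in which the fraction of Gα still carrying GTP decays exponentially. The sketch below assumes first-order kinetics, and the rate constant and the 100-fold RGS acceleration are hypothetical numbers chosen only for illustration; none of these values comes from the text.

```python
import math


def fraction_active(t_seconds, k_hydrolysis):
    """Fraction of G-alpha still GTP-bound after t seconds (first-order decay)."""
    return math.exp(-k_hydrolysis * t_seconds)


def half_life(k_hydrolysis):
    """Time for half of the G-alpha-GTP population to hydrolyze its GTP."""
    return math.log(2) / k_hydrolysis


# Hypothetical rate constants (per second), chosen only for illustration.
K_INTRINSIC = 0.05             # weak intrinsic GTPase activity
RGS_FOLD_ACCELERATION = 100    # assumed speed-up contributed by an RGS protein
K_WITH_RGS = K_INTRINSIC * RGS_FOLD_ACCELERATION

if __name__ == "__main__":
    print(f"half-life without RGS: {half_life(K_INTRINSIC):.1f} s")
    print(f"half-life with RGS:    {half_life(K_WITH_RGS):.2f} s")
```

Under these assumptions, the half-life of the active state shrinks in direct proportion to the fold acceleration, which is the sense in which an RGS protein shortens the "timer" set by the Gα subunit.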

Table 1 Human Diseases Linked to the G Protein Pathway

Disease | Defective G protein*
Albright’s hereditary osteodystrophy and pseudohypoparathyroidisms | Gsα
McCune–Albright syndrome | Gsα
Pituitary, thyroid tumors (gsp oncogene) | Gsα
Adrenocortical, ovarian tumors (gip oncogene) | Giα
Combined precocious puberty and pseudohypoparathyroidism | Gsα

Disease | Defective G protein-coupled receptor
Familial hypocalciuric hypercalcemia | Human analogue of BoPCAR1 receptor
Neonatal severe hyperparathyroidism | Human analogue of BoPCAR1 receptor (homozygous)
Hyperthyroidism (thyroid adenomas) | Thyrotropin receptor
Familial male precocious puberty | Luteinizing hormone receptor
X-linked nephrogenic diabetes insipidus | V2 vasopressin receptor
Retinitis pigmentosa | Rhodopsin receptor
Color blindness, spectral sensitivity variations | Cone opsin receptor
Familial glucocorticoid deficiency and isolated glucocorticoid deficiency | Adrenocorticotropic hormone (ACTH) receptor

*As described in the text, a G protein with a Gsα acts to stimulate the effector, whereas a G protein with a Giα inhibits the effector. Source: D. E. Clapham, reprinted with permission from Nature, Vol. 371, p. 109, 1994, copyright 1994. Nature by Nature Publishing Group. Reproduced with permission of Nature Publishing Group in the format reuse in a book/textbook via Copyright Clearance Center.

to the synthesis of a rhodopsin molecule that cannot activate its G protein and thus cannot pass the signal downstream to the effector. RP results from a mutation that leads to a loss of function of the encoded receptor. Many mutations that alter the structure of signaling proteins can have an opposite effect, leading to what is described as a “gain of function.” In one such case, mutations have been found to cause a type of benign thyroid tumor, called an adenoma. Unlike normal thyroid cells that secrete thyroid hormone only in response to stimulation by the pituitary hormone TSH, the cells of these thyroid adenomas secrete large quantities of thyroid hormone without having to be stimulated by TSH (the receptor is said to act constitutively). The TSH receptor in these cells contains an amino acid substitution that affects the structure of the third intracellular loop of the protein (Figure 1, mutations at sites 5 or 6). As a result of the mutation, the TSH receptor constitutively activates a G protein on its inner surface, sending a continual signal through the pathway that leads not only to excessive thyroid hormone secretion but to the excessive cell proliferation that causes the tumor. This conclusion was verified by introducing the mutant gene into cultured cells that normally lack this receptor and demonstrating that the synthesis of the mutant protein and its incorporation

into the plasma membrane led to the continuous production of cAMP in the genetically engineered cells. The mutation that causes thyroid adenomas is not found in the normal portion of a patient’s thyroid but only in the tumor tissue, indicating that the mutation was not inherited but arose in one of the cells of the thyroid, which then proliferated to give rise to the tumor. A mutation in a cell of the body, such as a thyroid cell, is called a somatic mutation to distinguish it from an inherited mutation that would be present in all of the individual’s cells. As will be evident in the following chapter, somatic mutations are a primary cause of human cancer. At least one cancer-causing virus has been shown to encode a protein that acts as a constitutively active GPCR. The virus is a type of herpes virus that is responsible for Kaposi’s sarcoma, which causes purplish skin lesions and is prevalent in AIDS patients. The virus genome encodes a constitutively active receptor for interleukin-8, which stimulates signaling pathways that control cell proliferation.

As noted in Table 1, mutations in genes that encode the subunits of heterotrimeric G proteins can also lead to inherited disorders. This is illustrated by a report on two male patients suffering from a rare combination of endocrine disorders: precocious puberty and hypoparathyroidism. Both patients were found to contain a single amino acid substitution in one of the Gα isoforms. The alteration in amino acid sequence caused two effects on the mutant G protein. At temperatures below normal body temperature, the mutant G protein remained in the active state, even in the absence of a bound ligand. In contrast, at normal body temperatures, the mutant G protein was inactive, both in the presence and absence of bound ligand. The testes, which are housed outside of the body’s core, have a lower temperature than the body’s visceral organs (33°C vs. 37°C). Normally, the endocrine cells of the testes initiate testosterone production at the time of puberty in response to the pituitary hormone LH, which begins to be produced at that time. The circulating LH binds to LH receptors on the surface of the testicular cells, inducing the synthesis of cAMP and subsequent production of the male sex hormone. The testicular cells of the patients bearing the G protein mutation were stimulated to synthesize cAMP in the absence of the LH ligand, leading to premature synthesis of testosterone and precocious puberty. In contrast, the mutation in this same Gα subunit in the cells of the parathyroid glands, which function at a temperature of 37°C, caused the G protein to remain inactive. As a result, the cells of the parathyroid gland could not respond to stimuli that would normally cause them to secrete parathyroid hormone, leading to the condition of hypoparathyroidism. The fact that most of the bodily organs functioned in a normal manner in these patients suggests that this particular Gα isoform is not essential in the activities of most other cells.

Mutations are thought of as rare and disabling changes in the nucleotide sequence of a gene. Genetic polymorphisms, in contrast, are thought of as common, “normal” variations within the population (page 416). Yet it has become clear in recent years that genetic polymorphisms may have considerable impact on human disease, causing certain individuals to be more or less susceptible to particular disorders than other individuals. This has been well documented in the case of GPCRs. For example, certain alleles of the gene encoding the β2 adrenergic receptor have been associated with an increased likelihood of developing asthma or high blood pressure; certain alleles of a dopamine receptor are associated with increased risk of substance abuse or schizophrenia; and certain alleles of a chemokine receptor (CCR5) are associated with prolonged survival in HIV-infected individuals. As discussed on page 417, identifying associations between disease susceptibility and genetic polymorphisms is a current focus of clinical research.
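The temperature-dependent behavior of the mutant Gα subunit described above amounts to a small truth table: the same substitution produces a gain of function in the cooler testes and a loss of function in the parathyroid. The following sketch restates that logic; the function name is hypothetical, and the sharp cutoff at 37°C is a simplifying assumption made for the example rather than a measured property of the protein.

```python
NORMAL_BODY_TEMP_C = 37.0

# Tissue temperatures quoted in the text (degrees Celsius).
TISSUES = {"testis": 33.0, "parathyroid": 37.0}


def g_protein_signals(mutant, temp_c, ligand_bound):
    """Return True if the G protein is in the active state under these conditions."""
    if not mutant:
        # Wild type: active only when the receptor has bound its ligand.
        return ligand_bound
    # Mutant: active below body temperature regardless of ligand,
    # inactive at body temperature regardless of ligand (assumed sharp cutoff).
    return temp_c < NORMAL_BODY_TEMP_C
```

Evaluating the function for each tissue with and without ligand reproduces the two phenotypes: signaling without LH in the testis (precocious puberty) and no signaling even with a stimulus in the parathyroid (hypoparathyroidism).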


Bacterial Toxins

Because G proteins are so important to the normal physiology of multicellular organisms, they provide excellent targets for bacterial pathogens. For example, cholera toxin (produced by the bacterium Vibrio cholerae) exerts its effect by modifying Gα subunits and inhibiting their GTPase activity in the cells of the intestinal epithelium. As a result, adenylyl cyclase molecules remain in an activated mode, churning out cAMP, which causes the epithelial cells to secrete large volumes of fluid into the intestinal lumen. The loss of water associated with this inappropriate response often leads to death due to dehydration. Pertussis toxin is one of several virulence factors produced by Bordetella pertussis, a microorganism that causes whooping cough. Whooping cough is a debilitating respiratory tract infection seen in 50 million people worldwide each year, causing death in about 350,000 of these cases annually. Pertussis toxin also inactivates Gα subunits, thereby interfering with the signaling pathway that leads the host to mount a defensive response against the bacterial infection.

Second Messengers

Figure 15.8 The localized formation of cAMP in a live cell in response to the addition of an extracellular messenger molecule. This series of photographs shows a sensory nerve cell from the sea hare Aplysia. The concentration of free cAMP is indicated by the color: blue represents a low cAMP concentration, yellow an intermediate concentration, and red a high concentration. The left image shows the intracellular cAMP level in the unstimulated neuron, and the next three images show the effects of stimulation by the neurotransmitter serotonin (5-hydroxytryptamine) at the times

Phosphatidylinositol-Derived Second Messengers

It wasn’t very long ago that the phospholipids of cell membranes were considered strictly as structural molecules that made membranes cohesive and impermeable to aqueous solutes. Our appreciation of phospholipids has increased with the realization that these molecules form the precursors of a number of second messengers. Phospholipids of cell membranes are

indicated. Notice that the cAMP levels drop by 109 s despite the continued presence of the neurotransmitter. (The cAMP level was determined indirectly in this experiment by microinjection of a cAMP-dependent protein kinase fluorescently labeled with fluorescein and rhodamine on different subunits. Energy transfer between the subunits (see page 738) provides a measure of cAMP concentration.) (FROM BRIAN J. BACSKAI ET AL., SCIENCE 260:223, 1993. REPRINTED WITH PERMISSION FROM AAAS.)


The Discovery of Cyclic AMP, a Prototypical Second Messenger

How does the binding of a hormone to the plasma membrane change the activity of cytoplasmic enzymes, such as glycogen phosphorylase, an enzyme involved in glycogen metabolism? The answer to this question was provided by studies that began in the mid-1950s in the laboratories of Earl Sutherland and his colleagues at Case Western Reserve University. Sutherland’s goal was to develop an in vitro system to study the physiologic responses to hormones. After considerable effort, he was able to activate glycogen phosphorylase in a preparation of broken cells that had been incubated with glucagon or epinephrine. This broken-cell preparation could be divided by centrifugation into a particulate fraction consisting primarily of cell membranes and a soluble supernatant fraction. Even though

glycogen phosphorylase was present only in the supernatant fraction, the particulate material was required to obtain the hormone response. Subsequent experiments indicated that the response occurred in at least two distinct steps. If the particulate fraction of a liver homogenate was isolated and incubated with the hormone, some substance was released that, when added to the supernatant fraction, activated the soluble glycogen phosphorylase molecules. Sutherland identified the substance released by the membranes of the particulate fraction as cyclic adenosine monophosphate (cyclic AMP, or simply cAMP). This discovery is heralded as the beginning of the study of signal transduction. As will be discussed below, cAMP stimulates glucose mobilization by activating a protein kinase that adds a phosphate group onto a specific serine residue of the glycogen phosphorylase polypeptide.

Cyclic AMP is a second messenger that is capable of diffusing to other sites within the cell. The synthesis of cyclic AMP follows the binding of a first messenger (a hormone or other ligand) to a receptor at the outer surface of the cell. Figure 15.8 shows the diffusion of cyclic AMP within the cytoplasm of a neuron following stimulation by an extracellular messenger molecule. Whereas the first messenger binds exclusively to a single receptor species, the second messenger often stimulates a variety of cellular activities. As a result, second messengers enable cells to mount a large-scale, coordinated response following stimulation by a single extracellular ligand. Other second messengers include Ca2+, phosphoinositides, inositol trisphosphate, diacylglycerol, cGMP, and nitric oxide.
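The logic of Sutherland's fractionation experiment, in which the membrane fraction and the soluble fraction each contribute one step of the response, can be restated as two small functions. This is only a schematic of the reasoning, not a model of the biochemistry; the function names and the use of the string "cAMP" as a token for the released substance are assumptions made for the example.

```python
def particulate_fraction(hormone_present):
    """Membranes release the second messenger only when hormone is present."""
    return "cAMP" if hormone_present else None


def supernatant_fraction(added_substance):
    """Soluble glycogen phosphorylase is activated only if cAMP is supplied."""
    return added_substance == "cAMP"


def broken_cell_assay(hormone_present, include_membranes):
    """Return True if glycogen phosphorylase ends up activated."""
    messenger = particulate_fraction(hormone_present) if include_membranes else None
    return supernatant_fraction(messenger)
```

Of the four possible combinations, only hormone plus membranes gives an active enzyme, which is the observation that pointed to a membrane-generated intermediate acting on a soluble target.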

Figure 15.9 Phospholipid-based second messengers. (a) The structure of a generalized phospholipid (see Figure 2.22). Phospholipids are subject to attack by four types of phospholipases that cleave the molecule at the indicated sites. Of these enzymes, we will focus on PLC, which splits the phosphorylated head group from the diacylglycerol (see Figure 15.10). (b) A model showing the interaction between a portion of a PLC enzyme molecule containing a PH domain that binds to the phosphorylated inositol ring of a phosphoinositide. This interaction holds the enzyme to the inner surface of the plasma membrane and may alter its enzymatic activity. (c) Fluorescence micrograph of a cell that has been stimulated to move toward a chemoattractant (i.e., a chemical to which the cell is attracted). This cell has been stained with an antibody that binds specifically to PI 3,4,5-trisphosphate (PIP3), which is seen to be localized at the leading edge of the migrating cell (arrows). Bar equals 15 μm. (B: FROM JAMES H. HURLEY AND JAY A. GROBLER, CURR. OPIN. STRUCT. BIOL. 7:559, 1997; C: FROM PAULA RICKERT ET AL., COURTESY OF HENRY R. BOURNE, UNIVERSITY OF CALIFORNIA, SAN FRANCISCO, TRENDS CELL BIOL. 10:470, 2000. BOTH IMAGES REPRINTED WITH PERMISSION FROM ELSEVIER.)

converted into second messengers by a variety of enzymes that are regulated in response to extracellular signals. These enzymes include phospholipases (lipid-splitting enzymes), phospholipid kinases (lipid-phosphorylating enzymes), and phospholipid phosphatases (lipid-dephosphorylating enzymes). Phospholipases are enzymes that hydrolyze specific ester bonds that connect the different building blocks that make up a phospholipid molecule. Figure 15.9a shows the cleavage sites within a generalized phospholipid that are attacked by the main classes of phospholipases. All four of the enzyme classes depicted in Figure 15.9a can be activated in response to extracellular signals and the products they produce function as second messengers. In this section, we will focus on the best-studied lipid second messengers, which are derived from phosphatidylinositol, and are generated following the transmission of signals by G protein-coupled receptors and receptor protein-tyrosine kinases. Another group of lipid second messengers derived from sphingomyelin is not discussed.

Phosphatidylinositol Phosphorylation

When the neurotransmitter acetylcholine binds to the surface of a smooth muscle cell within the wall of the stomach, the muscle cell is stimulated to contract. When a foreign antigen binds to the surface of a mast cell, the cell is stimulated to secrete histamine, a substance that can trigger the symptoms of an allergy attack. Both of these responses, one leading to contraction and the other to secretion, are triggered by the same second messenger, a substance derived from the compound phosphatidylinositol, a minor component of most cellular membranes (see Figure 4.10).

The first indication that phospholipids might be involved in cellular responses to extracellular signals emerged from studies carried out in the early 1950s by Lowell and Mabel Hokin of Montreal General Hospital and McGill University. These investigators had set out to study the effects of acetylcholine on RNA synthesis in the pancreas. To carry out these studies, they incubated slices of pigeon pancreas in [32P]orthophosphate. The idea was that [32P]orthophosphate would be incorporated into nucleoside triphosphates, which are used as precursors during the synthesis of RNA. Interestingly, they found that treatment of the tissue with acetylcholine led to the incorporation of radioactivity into the phospholipid fraction of the cell. Further analysis revealed that the isotope was incorporated primarily into phosphatidylinositol (PI), which was rapidly converted to other phosphorylated derivatives, which are collectively referred to as phosphoinositides. This suggested that inositol-containing lipids can be phosphorylated by specific lipid kinases that are activated in response to extracellular messenger molecules, such as acetylcholine. It is now well established that lipid kinases are activated in response to a large variety of extracellular signals.

Several of the reactions of phosphoinositide metabolism are shown in Figure 15.10. As indicated on the left side of this figure, the inositol ring, which resides at the cytoplasmic surface of the bilayer, has six carbon atoms. Carbon number 1 is involved in the linkage between inositol and diacylglycerol. The 3, 4, or 5 carbons can be phosphorylated by specific phosphoinositide kinases present in cells to generate seven distinct phosphoinositides. For example, transfer of a single phosphate group to the 4-position of the inositol ring of PI by PI 4-kinase (PI4K) generates PI 4-phosphate (PI(4)P), which can

[Figure 15.10 diagram labels: ligand; receptor (R); G protein (α, β, γ subunits; GDP, GTP); kinase; PI; PI(4)P; PI(4,5)P2; inner leaflet; PI-PLCβ; DAG; PKC; IP3; IP3 receptor; Ca2+; smooth ER; steps 1 to 9]

which PI(4,5)P2 is split into diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3) (step 5). DAG recruits the protein kinase PKC to the membrane and activates the enzyme (step 6). IP3 diffuses into the cytosol (step 7), where it binds to an IP3 receptor and Ca2+ channel in the membrane of the SER (step 8). Binding of IP3 to its receptor causes release of calcium ions into the cytosol (step 9).

be phosphorylated by PIP 5-kinase (PIP5K) to form PI 4,5-bisphosphate (PI(4,5)P2; Figure 15.10, steps 1 and 2). PI(4,5)P2 can be phosphorylated by PI 3-kinase (PI3K) to form PI(3,4,5)P3 (PIP3) (shown in Figure 15.25c). The phosphorylation of PI(4,5)P2 to form PIP3 is of particular interest because the PI3K enzymes involved in this process can be controlled by a large variety of extracellular molecules and PI3K overactivity has been associated with human cancers. The formation of PIP3 during the response to insulin is discussed on page 645. All of the phospholipid species discussed above remain in the cytoplasmic leaflet of the plasma membrane; they are membrane-bound second messengers. Just as there are lipid kinases to add phosphate groups to phosphoinositides, there are lipid phosphatases (e.g., PTEN) to remove them. The activities of these kinases and phosphatases are coordinated so that specific phosphoinositides appear at specific membrane compartments at specific times after a signal has been received. The role of specific phosphoinositides in membrane trafficking was discussed on page 311. The phosphorylated inositol rings of phosphoinositides form binding sites for several lipid-binding domains (PH, PX, and FYVE) found in proteins. Best known is the PH domain (Figure 15.9b), which has been identified in over 150 different proteins. Binding of a protein by its PH domain to PI(3,4)P2

or PIP3 typically recruits the protein to the cytoplasmic face of the plasma membrane where it can interact with other membrane-bound proteins, including activators, inhibitors, or substrates. Figure 15.9c shows an example where PIP3 is specifically localized to a particular portion of the plasma membrane of a cell. PIP3 is produced at the front of the cell by a localized lipid kinase and is subsequently degraded at the rear and sides of the cell by a localized lipid phosphatase. The cell shown in Figure 15.9c is engaged in chemotaxis, which is to say that it is moving toward an increasing concentration of a particular chemical in the medium that serves as the chemoattractant. This is the mechanism that causes phagocytic cells such as macrophages to move toward bacteria or other targets that they engulf. Chemotaxis depends on the localized production of phosphoinositide messengers, which bind to certain actin-binding proteins (page 372) to influence the formation of actin filaments and lamellipodia that are required to move the cell in the direction of the target.

Phospholipase C

Not all inositol-containing second messengers remain in the lipid bilayer of a membrane. When acetylcholine binds to a smooth muscle cell, or an antigen binds to a mast cell, the bound receptor activates a heterotrimeric G protein (Figure 15.10, step 3), which, in turn, activates the effector phosphatidylinositol-specific phospholipase

15.3 G Protein-Coupled Receptors and Their Second Messengers

Figure 15.10 The generation of second messengers as a result of ligand-induced breakdown of phosphoinositides (PI) in the lipid bilayer. In steps 1 and 2, phosphate groups are added by lipid kinases to phosphatidylinositol (PI) to form PIP2. When a stimulus is received by a receptor, the ligand-bound receptor activates a heterotrimeric G protein containing a Gαq subunit (step 3), which activates the enzyme PI-specific phospholipase C-β (step 4), which catalyzes the reaction in


Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells

Diacylglycerol

Diacylglycerol (Figure 15.10) is a lipid molecule that remains in the plasma membrane following its formation by PLCβ. There it recruits and activates effector proteins that bear a DAG-binding C1 domain. The best-studied of these effectors is a family of proteins called protein kinase C (PKC) (step 6, Figure 15.10), which phosphorylate serine and threonine residues on a wide variety of target proteins. Protein kinase C isoforms have a number of important roles in cellular growth and differentiation, cellular metabolism, cell death, and immune responses. The apparent importance of protein kinase C in growth control is seen in studies with a group of powerful plant compounds, called phorbol esters, that resemble DAG. These compounds activate protein kinase C isoforms in a variety of cultured cells, causing them to lose growth control and behave temporarily as malignant cells. When the phorbol ester is removed from the medium, the cells recover their normal growth properties. In contrast, cells that have been genetically engineered to constitutively express protein kinase C exhibit a permanent malignant phenotype in cell culture and can cause tumors in susceptible mice. The importance of PKC in the development of immune responses is seen in studies of a specific PKC inhibitor called AEB071, which is being tested in clinical trials as an immunosuppressant to prevent rejection of transplanted kidneys and to treat psoriasis, an autoimmune skin disease.

Inositol 1,4,5-trisphosphate (IP3)

Inositol 1,4,5-trisphosphate (IP3) is a sugar phosphate, a small, water-soluble molecule capable of rapid diffusion throughout the interior of the cell. IP3 molecules formed at the membrane diffuse into the cytosol (step 7, Figure 15.10) and bind to a specific IP3 receptor located at the surface of the smooth endoplasmic reticulum (step 8). It was noted on page 280 that the smooth endoplasmic reticulum is a site of calcium storage in a variety of cells. The IP3 receptor also functions as a tetrameric Ca2+ channel. Binding of IP3 opens the channel, allowing Ca2+ ions to diffuse into the cytoplasm (step 9). Calcium ions can also be considered as intracellular or second messengers because they bind to various target molecules, triggering specific responses. Both of the responses used as examples above, contraction of a smooth muscle cell and exocytosis of histamine-containing secretory granules in a mast cell, are triggered by elevated calcium levels. So, too, is the response of a liver cell to the hormone vasopressin (the same hormone that has antidiuretic activity in the kidney, page 151). Vasopressin binds to its receptor at the liver cell surface and causes a series of IP3-mediated bursts of Ca2+ release, which appear as oscillations

[Figure 15.11 plot: trace recorded at 0.4 nM vasopressin; vertical axis, [Ca2+] (nM)]

C-β (PLCβ) (step 4). Like the protein depicted in Figure 15.9b, PLCβ is situated at the inner surface of the membrane (Figure 15.10), bound there by the interaction between its PH domain and a phosphoinositide embedded in the bilayer. PLCβ catalyzes a reaction that splits PI(4,5)P2 into two molecules, inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG) (step 5, Figure 15.10), both of which play important roles as second messengers in cell signaling. We will examine each of these second messengers separately.
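The phosphoinositide bookkeeping described in this section can be sketched in a few lines of Python. The sketch treats each phosphoinositide as the set of inositol ring positions (3, 4, 5) that carry a phosphate, which reproduces the count of seven distinct species, and it replays the PI4K, PIP5K, and PI3K steps named in the text. The function and variable names are illustrative assumptions, not standard nomenclature.

```python
from itertools import combinations

POSITIONS = (3, 4, 5)  # ring carbons that the text says can be phosphorylated


def name(phosphorylated):
    """Return a PI(x,y)Pn style label for a set of phosphorylated positions."""
    sites = sorted(phosphorylated)
    if not sites:
        return "PI"
    count = "" if len(sites) == 1 else str(len(sites))
    return "PI(" + ",".join(map(str, sites)) + ")P" + count


# Every non-empty subset of {3, 4, 5} is one phosphoinositide: 2**3 - 1 = 7.
phosphoinositides = [
    name(subset)
    for size in range(1, len(POSITIONS) + 1)
    for subset in combinations(POSITIONS, size)
]


def phosphorylate(lipid, position):
    """Model a lipid kinase adding one phosphate at the given ring position."""
    if position in lipid:
        raise ValueError(f"position {position} is already phosphorylated")
    return lipid | {position}


pi = frozenset()
pi4p = phosphorylate(pi, 4)     # PI 4-kinase (PI4K)
pip2 = phosphorylate(pi4p, 5)   # PIP 5-kinase (PIP5K)
pip3 = phosphorylate(pip2, 3)   # PI 3-kinase (PI3K)

print(phosphoinositides)
print([name(x) for x in (pi, pi4p, pip2, pip3)])
```

A lipid phosphatase such as PTEN would be the reverse operation, removing one position from the set.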

[Figure 15.11 plot: horizontal axis, time (minutes)]

Figure 15.11 Experimental demonstration of changes in free calcium concentration in response to hormone stimulation. A single liver cell was injected with aequorin, a protein extracted from certain jellyfish that luminesces when it binds calcium ions. The intensity of the luminescence is proportional to the concentration of free calcium ions. Exposure of the cell to vasopressin leads to controlled spikes in the concentration of free calcium at periodic intervals. Higher concentrations of hormone do not increase the height (amplitude) of the spikes, but they do increase their frequency. (REPRINTED WITH PERMISSION FROM N. M. WOODS, K. S. CUTHBERTSON, AND P. H. COBBOLD, NATURE 319:601, 1986; COPYRIGHT 1986. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

of free cytosolic calcium in the recording shown in Figure 15.11. The frequency and intensity of such oscillations may encode information that governs the cell’s specific response. Some of the responses mediated by IP3 are listed in Table 15.2. Ca2+ ions are discussed further in Section 15.5.
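One way to make the idea of frequency encoding concrete is to count spikes in a calcium trace. The sketch below is only an illustration: the traces are synthetic numbers invented for the example (not data from Figure 15.11), the threshold is an arbitrary assumption, and the function names are hypothetical. It shows two traces with the same spike height but different spike frequency, the pattern the figure legend describes for increasing hormone concentration.

```python
def count_spikes(trace, threshold):
    """Count upward crossings of `threshold` in a sequence of [Ca2+] samples."""
    spikes = 0
    above = False
    for value in trace:
        if value >= threshold and not above:
            spikes += 1
        above = value >= threshold
    return spikes


def spike_frequency(trace, threshold, duration_min):
    """Spikes per minute over a recording that lasted `duration_min` minutes."""
    return count_spikes(trace, threshold) / duration_min


# Synthetic traces (nM), invented for illustration only: same spike height,
# different spacing, mimicking a lower and a higher hormone dose.
baseline, peak = 200, 600
low_dose = ([baseline] * 9 + [peak]) * 3    # 3 spikes in 30 samples
high_dose = ([baseline] * 4 + [peak]) * 6   # 6 spikes in 30 samples

for label, trace in (("low dose", low_dose), ("high dose", high_dose)):
    freq = spike_frequency(trace, threshold=400, duration_min=30)
    print(label, "amplitude:", max(trace), "spikes/min:", freq)
```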

The Specificity of G Protein-Coupled Responses

A wide variety of agents, including hormones, neurotransmitters, and sensory stimuli, act by way of GPCRs and heterotrimeric G proteins to transmit information across the plasma membrane, triggering a wide variety of cellular responses. Thus as a group, GPCRs are capable of binding a diverse array of ligands. In addition, the receptor for a given ligand can exist in several different versions (isoforms). For example, researchers have identified 9 different isoforms of the adrenergic receptor, which binds epinephrine, and 15 different isoforms of the receptor for serotonin, a powerful neurotransmitter released by nerve cells in parts of the brain. Different isoforms can have different affinities for the ligand or may interact with different types of G proteins. Different isoforms of a receptor may coexist in the same plasma membrane, or they may occur in the membranes of different types of target cells. The heterotrimeric G proteins that transmit signals from receptor to effector can also exist in multiple forms, as can many of the effectors. The human genome encodes at least 16 different Gα


Table 15.2 Summary of Cellular Responses Elicited by Adding IP3 to Either Permeabilized or Intact Cells

Cell type                 Response
Vascular smooth muscle    Contraction
Stomach smooth muscle     Contraction
Slime mold                Cyclic GMP formation, actin polymerization
Blood platelets           Shape change, aggregation
Salamander rods           Modulation of light response
Xenopus oocytes           Calcium mobilization, membrane depolarization
Sea urchin eggs           Membrane depolarization, cortical reaction
Lacrimal gland            Increased potassium current

Adapted from M. J. Berridge, reproduced with permission from the Annual Review of Biochemistry, Vol. 56, copyright 1987 Annual review of biochemistry. Volume 56, 1987 by Richardson, Charles C. Reproduced with permission of Annual Reviews, Incorporated in the format Republish in a book via Copyright Clearance Center.

Regulation of Blood Glucose Levels Glucose can be utilized as a source of energy by nearly all cell types present in the body. It is oxidized to CO2 and H2O by glycolysis and the TCA cycle, providing cells with ATP that can be used to drive energy-requiring reactions. The body maintains glucose levels in the bloodstream within a narrow range. As discussed in Chapter 3, excess glucose is stored in animal cells as glycogen, a large branched polymer composed of glucose monomers that are linked through glycosidic bonds. The hormone glucagon is produced by the alpha cells of the pancreas in response to low blood glucose levels. Glucagon stimulates breakdown of glycogen and release of glucose into the bloodstream, thereby causing glucose levels to rise. The hormone insulin is produced by the beta cells of the pancreas in response to high glucose levels and stimulates glucose uptake and storage as glycogen. Finally, epinephrine—

[Figure 15.12 diagram labels: (Glucose)n–1 + UDP-glucose; UTP; PPi; Glycogen (glucose)n; Pi; Glycogen phosphorylase; (Glucose)n–1 + glucose 1-phosphate; Glucose 6-phosphate; Fructose 6-phosphate; Glycolysis; Glucose + Pi; Blood]

Figure 15.12 The reactions that lead to glucose storage or mobilization. The activities of two of the key enzymes in these reactions, glycogen phosphorylase and glycogen synthase, are controlled by hormones that act through signal transduction pathways. Glycogen phosphorylase is activated in response to glucagon and epinephrine, whereas glycogen synthase is activated in response to insulin (page 646).

which is sometimes called the “fight-or-flight” hormone—is produced by the adrenal gland in stressful situations. Epinephrine causes an increase in blood glucose levels to provide the body with the extra energy resources needed to deal with the stressful situation at hand. Insulin acts through a receptor protein-tyrosine kinase and its signal transduction is discussed on page 644. In contrast, both glucagon and epinephrine act by binding to GPCRs. Glucagon is a small protein that is composed of 29 amino acids, whereas epinephrine is a small molecule that is derived from the amino acid tyrosine. Structurally speaking, these two molecules have nothing in common, yet both of them bind to GPCRs and stimulate the breakdown of glycogen into glucose 1-phosphate (Figure 15.12). In addition, the binding of either of these hormones leads to the inhibition of the enzyme glycogen synthase, which catalyzes the opposing reaction in which glucose units are added to growing glycogen molecules. Thus two different stimuli (glucagon and epinephrine), recognized by different receptors, induce the same response in a single target cell. The two receptors differ from one another primarily in the structure of the ligand-binding pocket on the extracellular surface of the cell, which is specific for one or the other hormone.


subunits, 5 different Gβ subunits, and 11 different Gγ subunits, along with 9 isoforms of the effector adenylyl cyclase. Different combinations of specific subunits construct G proteins having different capabilities of reacting with specific isoforms of both receptors and effectors. As mentioned on page 624, some G proteins act by inhibiting their effectors. The same stimulus can activate a stimulatory G protein (one with a Gαs subunit) in one cell and an inhibitory G protein (one with a Gαi subunit) in a different cell. For example, when epinephrine binds to a β-adrenergic receptor on a cardiac muscle cell, a G protein with a Gαs subunit is activated, which stimulates cAMP production, leading to an increase in the rate and force of contraction. In contrast, when epinephrine binds to an α-adrenergic receptor on a smooth muscle cell in the intestine, a G protein with a Gαi subunit is activated, which inhibits cAMP production, producing muscle relaxation. Finally, some adrenergic receptors turn on G proteins with Gαq subunits, leading to activation of PLCβ. Clearly, the same extracellular messenger can activate a variety of pathways in different cells.
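The subunit counts and coupling rules in this passage lend themselves to a small worked example. The sketch below multiplies the quoted subunit counts to get an upper bound on possible heterotrimers, and records the three G-alpha classes and the two epinephrine examples as lookup tables. The product is a ceiling only, since it assumes that every alpha/beta/gamma combination can assemble, which the text does not claim. All identifiers are hypothetical.

```python
from math import prod

# Subunit counts quoted in the text ("at least" 16 for G-alpha).
SUBUNIT_GENES = {"G_alpha": 16, "G_beta": 5, "G_gamma": 11}

# Upper bound only: assumes every alpha/beta/gamma combination can assemble.
max_heterotrimers = prod(SUBUNIT_GENES.values())

# Effect on the effector for the three G-alpha classes discussed in the text.
G_ALPHA_EFFECT = {
    "G-alpha-s": "stimulates cAMP production",
    "G-alpha-i": "inhibits cAMP production",
    "G-alpha-q": "activates PLC-beta",
}

# Epinephrine examples from the text: (receptor, cell) -> G-alpha class.
EPINEPHRINE_COUPLING = {
    ("beta-adrenergic receptor", "cardiac muscle cell"): "G-alpha-s",
    ("alpha-adrenergic receptor", "intestinal smooth muscle cell"): "G-alpha-i",
}


def epinephrine_effect(receptor, cell):
    """Look up the G-alpha class for a receptor/cell pair and return its effect."""
    g_alpha = EPINEPHRINE_COUPLING[(receptor, cell)]
    return g_alpha, G_ALPHA_EFFECT[g_alpha]


print(max_heterotrimers)
for key in EPINEPHRINE_COUPLING:
    print(key, "->", epinephrine_effect(*key))
```

The same ligand appears in both entries of the coupling table; only the receptor isoform and cell type differ, which is the point the passage makes.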


Following activation by their respective ligands, both receptors activate the same type of heterotrimeric G proteins that cause an increase in the levels of cAMP.

Glucose Mobilization: An Example of a Response Induced by cAMP

cAMP is synthesized by adenylyl cyclase, an integral membrane protein whose catalytic domain resides at the inner surface of the plasma membrane (Figure 15.13). cAMP evokes a response that leads to glucose mobilization by initiating a chain of reactions, as illustrated in Figure 15.14. The first step in this reaction cascade occurs as the hormone binds to its receptor, activating a Gαs subunit, which activates an adenylyl cyclase effector. The activated enzyme catalyzes the formation of cAMP (steps 1 and 2, Figure 15.14). Once formed, cAMP molecules diffuse into the cytoplasm where they bind to an allosteric site on a regulatory subunit of a cAMP-dependent protein kinase (protein kinase A, PKA) (step 3, Figure 15.14). In its inactive form, PKA is a heterotetramer composed of two regulatory (R) and two catalytic (C) subunits. The regulatory subunits normally inhibit the catalytic activity of the enzyme. cAMP binding causes the dissociation of the regulatory subunits, thereby releasing the active catalytic subunits of PKA. The target substrates of PKA in a liver cell include two enzymes that play a pivotal role in glucose metabolism, glycogen synthase and phosphorylase kinase (steps 4 and 5). Phosphorylation of glycogen synthase inhibits its catalytic activity and thus prevents the conversion of glucose to glycogen. In contrast, phosphorylation of phosphorylase kinase activates the enzyme to catalyze the transfer of phosphate groups to glycogen phosphorylase molecules. As discovered by Krebs and Fischer (page 115), the addition of a single phosphate group to a specific serine residue in the glycogen phosphorylase polypeptide activates this enzyme (step 6), stimulating the breakdown of glycogen (step 7). The glucose 1-phosphate formed in the reaction is converted to glucose, which diffuses into the bloodstream and so reaches the other tissues of the body (step 8). As one might expect, a mechanism must exist to reverse the steps discussed above; otherwise the cell would remain in the activated state indefinitely. Liver cells contain phosphatases that remove the phosphate groups added by the kinases. A particular member of this family of enzymes, protein phosphatase-1, can remove phosphates from all of the phosphorylated enzymes of Figure 15.14: phosphorylase kinase, glycogen synthase, and glycogen phosphorylase. The destruction of cAMP molecules present in the cell is accomplished by the enzyme cAMP phosphodiesterase, which helps terminate the response.

Signal Amplification

The binding of a single hormone molecule at the cell surface can activate a number of G proteins, each of which can activate an adenylyl cyclase effector, each of which can produce a large number of cAMP messengers in a short period of time. Thus, the production of a second messenger provides a mechanism to greatly amplify the signal generated from the original message. Many of the steps in the reaction cascade illustrated in Figure 15.14 result in amplification of the signal (these steps are indicated by the

[Figure 15.13 diagram labels: transmembrane helices 1 to 12; N; COOH; active site; chemical structures of ATP and cAMP; + PPi]

Figure 15.13 Formation of cyclic AMP from ATP is catalyzed by adenylyl cyclase, an integral membrane protein that consists of two parts, each containing six transmembrane helices (shown here in two dimensions). The enzyme’s active site is located on the inner surface of the membrane in a cleft situated between two similar cytoplasmic domains. The breakdown of cAMP (not shown) is accomplished by a phosphodiesterase, which converts the cyclic nucleotide to a 5′ monophosphate.

blue arrows). cAMP molecules activate PKA. Each PKA catalytic subunit phosphorylates a large number of phosphorylase kinase molecules, which in turn phosphorylate an even larger number of glycogen phosphorylase molecules, which in turn can catalyze the formation of a much larger number of glucose phosphates. Thus, what begins as a barely perceptible stimulus at the cell surface is rapidly transformed into a major mobilization of glucose within the cell.

Other Aspects of cAMP Signal Transduction Pathways

Although the most rapid and best-studied effects of cAMP are produced in the cytoplasm, the nucleus and its genes also participate in the response. A fraction of the activated PKA molecules translocate into the nucleus where they phosphorylate key nuclear proteins (step 9, Figure 15.14), most notably a transcription factor called CREB (cAMP response element-binding protein). The phosphorylated version of CREB binds as a dimer to sites on the DNA (Figure 15.14, step 10) containing a particular nucleotide sequence (TGACGTCA), known as the cAMP response element (CRE). Recall from page 525 that response elements are sites in the DNA where transcription factors bind and increase the rate of initiation of transcription. CREs are located in the regulatory regions of genes that play a role in the response to cAMP. In liver cells,

[Figure 15.14 diagram labels: glucagon, epinephrine, others; receptor; G protein; Gαs; GTP; adenylyl cyclase; cAMP; phosphodiesterase; AMP; protein kinase A (PKA); glycogen synthase; phosphorylase kinase; glycogen phosphorylase; phosphatase; active and inactive forms; glycogen; glucose 1-phosphate; glucose 6-phosphate; glucose; CREB; CRE; DNA; mRNA; nucleus; cytoplasm; plasma membrane; bloodstream; steps 1 to 10]

Table 15.3 Examples of Hormone-Induced Responses Mediated by cAMP

Tissue           Hormone                            Response
Liver            Epinephrine and glucagon           Glycogen breakdown, glucose synthesis (gluconeogenesis), inhibition of glycogen synthesis
Skeletal muscle  Epinephrine                        Glycogen breakdown, inhibition of glycogen synthesis
Cardiac muscle   Epinephrine                        Increased contractility
Adipose          Epinephrine, ACTH, and glucagon    Triacylglycerol catabolism
Kidney           Vasopressin (ADH)                  Increased permeability of epithelial cells to water
Thyroid          TSH                                Secretion of thyroid hormones
Bone             Parathyroid hormone                Increased calcium resorption
Ovary            LH                                 Increased secretion of steroid hormones
Adrenal cortex   ACTH                               Increased secretion of glucocorticoids

for example, several of the enzymes involved in gluconeogenesis, a pathway by which glucose is formed from the intermediates of glycolysis (see Figure 3.31), are encoded by genes that contain nearby CREs. Thus, epinephrine and glucagon not only activate catabolic enzymes involved in glycogen breakdown, they promote the synthesis of anabolic enzymes required to synthesize glucose from smaller precursors. cAMP is produced in many different cells in response to a wide variety of different ligands (i.e., first messengers). Several of the hormonal responses mediated by cAMP in mammalian cells are listed in Table 15.3. Cyclic AMP pathways have also been implicated in processes occurring in the nervous system, including learning, memory, and drug addiction. Chronic use of opiates, for example, leads to elevated levels of adenylyl cyclase and PKA, which may be partially responsible for the physiologic responses that occur during drug withdrawal. Another cyclic nucleotide, cyclic GMP, also acts as a second messenger in certain cells as illustrated by the induced

Figure 15.14 The response by a liver cell to glucagon or epinephrine. The steps in the response to hormonal stimulation that lead to glucose mobilization are described in the text. Many of the steps in the reaction cascade are accompanied by a dramatic amplification of the signal. Steps leading to amplification are indicated by clusters of blue arrows. Activation of transcription by CREB occurs in conjunction with the usual array of coactivators (e.g., p300 and CBP) and chromatin-modifying complexes (not shown).
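The amplification described for this cascade is multiplicative, which a short calculation makes clear. The sketch below is a minimal illustration under loudly stated assumptions: the text gives no numerical gain for any step, so every number here is hypothetical, and the activation of PKA by cAMP is treated as one-to-one for simplicity. The function name `cascade_output` is also an assumption.

```python
def cascade_output(gains):
    """Return the running product of per-step gains, one entry per step."""
    total = 1
    running = []
    for step, gain in gains:
        total *= gain
        running.append((step, total))
    return running


# Hypothetical gains chosen only to show the multiplication; the text gives
# no numerical values, and cAMP-to-PKA activation is treated as one-to-one.
HYPOTHETICAL_GAINS = [
    ("G proteins activated per hormone-bound receptor", 10),
    ("cAMP molecules made per adenylyl cyclase", 100),
    ("phosphorylase kinase phosphorylated per PKA catalytic subunit", 10),
    ("glycogen phosphorylase phosphorylated per phosphorylase kinase", 10),
    ("glucose 1-phosphate released per glycogen phosphorylase", 100),
]

steps = cascade_output(HYPOTHETICAL_GAINS)
for step, cumulative in steps:
    print(f"{cumulative:>10,}  after: {step}")
```

With these made-up gains a single bound hormone molecule leads to ten million glucose 1-phosphate molecules. The specific figure is meaningless, but the way modest per-step gains compound is the point.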


relaxation of smooth muscle cells discussed on page 656. As discussed in the next section, cyclic GMP also plays a key role in the signaling pathway involved in vision. Because cAMP exerts most of its effects by activating PKA, the response of a given cell to cAMP is typically determined by the specific proteins phosphorylated by this kinase (Figure 15.15). Although activation of PKA in a liver cell in response to epinephrine leads to the breakdown of glycogen, activation of the same enzyme in a kidney tubule cell in response to vasopressin causes an increase in the permeability of the membrane to water, and activation of the enzyme in a thyroid cell in response to TSH leads to the secretion of thyroid hormone. Clearly, PKA must phosphorylate different substrates in each of these cell types, thereby linking the increase in cAMP levels induced by epinephrine, vasopressin, and TSH to different physiological responses. Over one hundred PKA substrates have been described. Most of these carry out different functions, which brings up the question as to how PKA phosphorylates the appropriate substrates in response to a particular stimulus, in a particular cell type. This question was answered in part by the observation that different cells express different PKA substrates and in part by the discovery of PKA-anchoring proteins or AKAPs that function as signaling hubs. The first AKAPs were discovered as proteins that co-purified with PKA. At least 50 different AKAPs have been discovered since, several of which are shown in Figure 15.16. As indicated in this figure, AKAPs provide a structural framework or scaffold for coordinating protein–protein interactions by sequestering PKA to specific locations within the cell. As a consequence, PKA accumulates in close proximity to one or more substrates. When cAMP levels rise and PKA is activated, the relevant substrates are present close by and they are the first ones to become phosphorylated. 
Substrate selection thus is partly a consequence of the subcellular localization of PKA in the presence of particular substrates. Different cells express different AKAPs, resulting in localization of PKA in the presence of different substrates and consequently phosphorylation of different substrates following an increase in cAMP levels. It is interesting to note that, unlike most proteins with a similar function, AKAPs have a diverse structure, suggesting that evolution has co-opted a variety of different types of proteins to carry out a similar role in cell signaling.
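The two factors that the passage identifies for substrate selection, which substrates a cell expresses and which of them an AKAP holds close to PKA, can be modeled as a filter followed by a sort. This is a deliberately simplified sketch: the substrate names are placeholders, the two boolean flags are assumptions standing in for real expression and localization data, and `phosphorylation_order` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substrate:
    name: str
    expressed: bool        # is the substrate made in this cell type?
    on_pka_scaffold: bool  # is it held near PKA by an AKAP?


def phosphorylation_order(substrates):
    """Expressed substrates only, with AKAP-anchored ones listed first."""
    present = [s for s in substrates if s.expressed]
    # sorted() is stable, and False sorts before True, so anchored
    # substrates (not on_pka_scaffold == False) come first.
    ranked = sorted(present, key=lambda s: not s.on_pka_scaffold)
    return [s.name for s in ranked]


# Placeholder substrates for one hypothetical cell type.
cell = [
    Substrate("substrate A", expressed=True, on_pka_scaffold=False),
    Substrate("substrate B", expressed=True, on_pka_scaffold=True),
    Substrate("substrate C", expressed=False, on_pka_scaffold=True),
]

order = phosphorylation_order(cell)
print(order)
```

Changing the flags for a different cell type changes the output without touching the kinase, which mirrors how one enzyme can yield different responses in liver, kidney, and thyroid cells.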

The Role of GPCRs in Sensory Perception

Our ability to see, taste, and smell depends largely on GPCRs. It was mentioned above that rhodopsin, whose structure and activation are depicted in Figure 15.5b, is a GPCR. Rhodopsin is the light-sensitive protein present in the rods of our retina, which are the photoreceptor cells that respond to low light intensity and provide us with a black-and-white picture of our environment at night or in a darkened room. Several closely related GPCRs are present in the cones of the retina, which provide us with color vision under conditions of brighter light. Absorption of a single photon of light induces a conformational change in the rhodopsin molecule, which transmits a signal to a heterotrimeric G protein (called transducin), which activates a coupled effector. The effector in this case is the

[Figure 15.15 diagram labels: cAMP; kinase; plasma membrane (transport); triglyceride lipase (fatty acid formation); microtubules (assembly/disassembly); endoplasmic reticulum (protein synthesis); glycogen synthase (glycogen formation); phosphorylase kinase; phosphorylase (glycogen breakdown); nucleus (DNA synthesis, differentiation, RNA synthesis)]

Figure 15.15 Schematic illustration of the variety of processes that can be affected by changes in cAMP concentration. All of these effects are thought to be mediated by activation of the same enzyme, protein kinase A. In fact, the same hormone can elicit very different responses in different cells, even when it binds to the same receptor. Epinephrine, for example, binds to a similar β-adrenergic receptor in liver cells, fat cells, and smooth muscle cells of the intestine, causing the production of cAMP in all three cell types. The responses, however, are quite different: glycogen is broken down in the liver cell, triacylglycerols are broken down in the fat cell, and the smooth muscle cells undergo relaxation. In addition to PKA, cAMP is known to interact with ion channels, phosphodiesterases, and GEFs (page 641).

enzyme cGMP phosphodiesterase, which hydrolyzes the cyclic nucleotide cGMP, a second messenger similar in structure to cAMP (Figure 15.13). cGMP plays an important role in visual excitation in the rod cells of the retina. In the dark, cGMP levels remain high and thus capable of binding to cGMP-gated sodium channels in the plasma membrane, keeping the channels in an open configuration, leading to a continued inward ionic current (a “dark current”). Activation of cGMP phosphodiesterase results in lowered cGMP levels, leading to the closure of the sodium channels. This response, which is unusual in that it is triggered by a decrease in the concentration of a second messenger, may lead to the generation of action potentials along the optic nerve. Our sense of smell depends on nerve impulses transmitted along olfactory neurons that extend from the epithelium that lines our upper nasal cavity to the olfactory bulb that is located at the base of our forebrain. The distal tips of these neurons, which are located in the nasal epithelium, contain odorant receptors, which are GPCRs capable of binding various chemicals that enter our nose. Mammalian odorant receptors were first identified in 1991 by Linda Buck and Richard Axel of Columbia University. It is estimated that humans express roughly 400 different odorant receptors that, taken together, are able to combine with a large variety of different chemical

[Figure 15.16 diagram labels: L-type Ca2+ channel; AMPA receptor; NMDA receptor; AKAP18; Yotiao; AKAP79/150; MAP2; AKAP350; mAKAP; D-AKAP1; pericentrin; WAVE1; Rab32; AKAP220; AKAP-Lbc; gravin; PKA; PKC; PKN; PKD; PP1; PP2A; PP2B; PDE; GSK3β; glucokinase; BAD; Abl; Rac; Rho; microtubules; centrosome; mitochondrion; nucleus; plasma membrane; actin; vesicles; cytoskeleton]

a number of different compartments, including the plasma membrane, mitochondrion, cytoskeleton, centrosome, and nucleus. (REPRINTED WITH PERMISSION FROM W. WONG AND J. D. SCOTT, NATURE REVIEWS MOL CELL BIOL 5:961, 2004; COPYRIGHT 2004. NATURE REVIEWS MOLECULAR CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

structures (odorants).1 Each olfactory receptor neuron expresses only one allele of one of the hundreds of different odorant receptor genes. Consequently, each of these sensory neurons contains only one specific odorant receptor and is only capable of responding to one or a few related chemicals. As a result, activation of different neurons containing different odorant receptors provides us with the perception of different aromas. That does not mean that a single chemical cannot interact with more than one olfactory receptor, but rather that the specific combination of receptors that are activated by that compound may play a key role in producing a particular smell. Mutations in a specific gene encoding a particular odorant receptor can leave a person with the inability to detect a particular chemical in their environment that most other members of the population can perceive. When activated by bound ligands, odorant receptors signal through heterotrimeric G proteins to adenylyl cyclase, resulting in the synthesis of cAMP and the opening of a cAMP-gated cation channel. This response leads to the generation of action potentials that are transmitted to the brain.
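The combinatorial argument in this paragraph can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that an odor is identified by which receptor types it activates. It counts how many distinct activation patterns are possible when one, two, or three of roughly 400 receptor types respond, and shows two odorants that share a receptor yet have different overall codes. The odorant and receptor names are invented.

```python
from math import comb

N_RECEPTORS = 400  # approximate number of human odorant receptors (from the text)

# If an odor were identified by which k receptor types it activates, the
# number of distinguishable patterns would be "N choose k".
patterns = {k: comb(N_RECEPTORS, k) for k in (1, 2, 3)}

# Hypothetical odorant -> receptor assignments (all names are invented).
ODORANT_CODE = {
    "odorant X": frozenset({"OR-a", "OR-b"}),
    "odorant Y": frozenset({"OR-b", "OR-c"}),
}


def shared_receptors(first, second):
    """Receptor types activated by both odorants."""
    return ODORANT_CODE[first] & ODORANT_CODE[second]


print(patterns)
print(shared_receptors("odorant X", "odorant Y"))
print(ODORANT_CODE["odorant X"] != ODORANT_CODE["odorant Y"])
```

Even under this crude model, a few hundred receptor types give millions of three-receptor patterns, which is consistent with the idea that the combination of activated receptors, rather than any single receptor, determines the perceived smell.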

Our perception of taste is much less discriminating than our perception of smell. Each taste receptor cell in the tongue transmits a sense of one of only five basic taste qualities, namely: salty, sour, sweet, bitter, or umami (from the Japanese word meaning “flavorful”). Taste receptor cells that elicit the taste of umami respond to the amino acids aspartate and glutamate and to purine nucleotides, generating a perception that a food is “savory.” This is the reason that monosodium glutamate and disodium guanylate are commonly added to processed foods to enhance flavor. The pleasurable umami taste is thought to have evolved as a mechanism to drive mammals to seek high-protein foods. The perception that a food or beverage is salty or sour is elicited directly by sodium ions or protons in the food. These ions pass through the plasma membrane of receptor cells via Na⫹ or H⫹ channels, respectively, eventually leading to a depolarization of the cell’s plasma membrane (page 166). In contrast, the perception that a food is bitter, sweet, or savory depends on a compound interacting with a GPCR at the surface of the receptor cell. Humans encode a family of about 30 bitter-taste receptors called T2Rs, which are coupled to the same heterotrimeric G protein. As a group, these taste receptors bind a diverse array of different compounds, including plant alkaloids or cyanides, that evoke a bitter taste in our mouths. For the most part, substances that evoke this perception are toxic compounds that elicit a distasteful, protective response that causes us to expel

1

The human genome contains roughly 1000 genes that encode odorant receptors but the majority are present as nonfunctional pseudogenes (page 408). Mice, which depend more heavily than humans on their sense of smell, have more than 1000 of these genes in their genome, and 95 percent of them encode functional receptors.

15.3 G Protein-Coupled Receptors and Their Second Messengers

Figure 15.16 A schematic representation of AKAP signaling complexes operating in different subcellular compartments. The AKAP in each of these protein complexes is represented by the purple bar. In each case, the AKAP forms a scaffold that brings together a PKA molecule with potential substrates and other proteins involved in the signaling pathway, including phosphatases (green triangles) that can remove the added phosphate groups and phosphodiesterases that can terminate continued signaling. The AKAPs shown here target PKA to

636

the food matter from our mouth. Unlike olfactory cells that contain a single receptor protein, a single taste-bud cell that evokes a bitter sensation contains a variety of different T2R receptors that respond to unrelated noxious substances. As a result, many diverse substances evoke the same basic taste, which is simply that the food we have eaten is bitter and disagreeable. In contrast, a food that elicits a sweet taste is likely to be one that contains energy-rich carbohydrates. Humans possess only one high affinity sweet-taste receptor (a T1R2T1R3 heterodimer) and it responds to sugars, certain sweet tasting peptides and proteins (e.g., monellin), and artificial sweeteners. Umami receptors consist of a TR1-TR3 heterodimer. Fortunately, food that is chewed releases odorants that travel via the throat to olfactory neurons in our nasal mucosa, allowing the brain to learn much more about the food we have eaten than the relatively simple messages provided by taste receptors. It is this merged input from both olfactory and taste (gustatory) receptors that provides us with our rich sense of taste. The importance of olfactory neurons in our perception of taste becomes more evident when we have a cold that causes us to lose some of our appreciation for the taste of food.
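The contrast drawn above between olfaction and bitter taste can be illustrated with a small sketch. It assumes a toy model in which each olfactory neuron carries one receptor type and a bitter-sensing cell carries several T2Rs. All receptor and compound names below are hypothetical placeholders, not real receptor-ligand pairings.

```python
# Hypothetical receptor-to-ligand tables, invented for illustration only.
OLFACTORY_RECEPTORS = {
    "OR_A": {"odorant_1", "odorant_2"},
    "OR_B": {"odorant_2", "odorant_3"},
    "OR_C": {"odorant_3"},
}
BITTER_RECEPTORS = {
    "T2R_x": {"alkaloid_1"},
    "T2R_y": {"cyanide_1"},
}


def olfactory_pattern(compound):
    """Return the set of receptor types (one per neuron) that respond.

    Because each neuron expresses a single receptor, the combination of
    responding neurons can differ from one compound to the next.
    """
    return frozenset(
        name for name, ligands in OLFACTORY_RECEPTORS.items() if compound in ligands
    )


def bitter_cell_response(compound):
    """Return "bitter" if any T2R in the cell binds the compound, else None.

    One cell carries many T2Rs, so unrelated compounds give the same signal.
    """
    hit = any(compound in ligands for ligands in BITTER_RECEPTORS.values())
    return "bitter" if hit else None
```

In this sketch, different odorants yield different receptor combinations, whereas unrelated noxious compounds collapse onto the single output "bitter".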

Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells

REVIEW
1. What is the role of G proteins in a signaling pathway?
2. Describe Sutherland’s experiment that led to the concept of the second messenger.
3. What is meant by the term amplification in regard to signal transduction? How does the use of a reaction cascade result in amplification of a signal? How does it increase the possibilities for metabolic regulation?
4. How is it possible that the same first messenger, such as epinephrine, can evoke different responses in different target cells? That the same second messenger, such as cAMP, can also evoke different responses in different target cells? That the same response, such as glycogen breakdown, can be initiated by different stimuli?
5. Describe the steps that lead from the synthesis of cAMP at the inner surface of the plasma membrane of a liver cell to the release of glucose into the bloodstream. How is this process controlled by GRKs and arrestin? By protein phosphatases? By cAMP phosphodiesterase?
6. Describe the steps between the binding of a ligand such as glucagon to a seven-transmembrane receptor and the activation of an effector, such as adenylyl cyclase. How is the response normally attenuated?
7. What is the mechanism of formation of the second messenger IP3? What is the relationship between the formation of IP3 and an elevation of intracellular [Ca²⁺]?
8. Describe the relationship between phosphatidylinositol, diacylglycerol, calcium ions, and protein kinase C. How do phorbol esters interfere with signal pathways involving DAG?

15.4 | Protein-Tyrosine Phosphorylation as a Mechanism for Signal Transduction

Protein-tyrosine kinases are enzymes that phosphorylate specific tyrosine residues on protein substrates. Protein-tyrosine phosphorylation is a mechanism for signal transduction that appeared with the evolution of multicellular organisms. Over 90 different protein-tyrosine kinases are encoded by the human genome. These kinases are involved in the regulation of growth, division, differentiation, survival, attachment to the extracellular matrix, and migration of cells. Expression of mutant protein-tyrosine kinases that cannot be regulated and are continually active can lead to uncontrolled cell division and the development of cancer. One type of leukemia, for example, occurs in cells that contain an unregulated version of the protein-tyrosine kinase ABL.

Protein-tyrosine kinases can be divided into two groups: receptor protein-tyrosine kinases (RTKs), which are integral membrane proteins that contain a single transmembrane helix and an extracellular ligand-binding domain, and nonreceptor (or cytoplasmic) protein-tyrosine kinases. The human genome encodes nearly 60 RTKs and 32 nonreceptor TKs. The first RTK to be studied, EGFR, was identified in 1978 by Stanley Cohen of Vanderbilt University. The discovery of the first nonreceptor TK is discussed on page 696. RTKs are activated directly by extracellular growth and differentiation factors such as epidermal growth factor (EGF) and platelet-derived growth factor (PDGF) or by metabolic regulators such as insulin. Nonreceptor protein-tyrosine kinases are regulated indirectly by extracellular signals and they control processes as diverse as the immune response, cell adhesion, and neuronal cell migration. This section of the chapter is focused on signal transduction by RTKs.
Receptor Dimerization

An obvious question comes to mind when considering the mechanics of signal transduction: How is the presence of a growth factor on the outside of the cell translated into biochemical changes inside the cell? Although structural biologists have yet to solve the three-dimensional structure of an entire RTK, it is widely accepted on the basis of other techniques that ligand binding results in the dimerization of the extracellular ligand-binding domains of a pair of receptors. Two mechanisms for receptor dimerization have been recognized: ligand-mediated dimerization and receptor-mediated dimerization (Figure 15.17).

Early work suggested that ligands of RTKs contain two receptor-binding sites. This made it possible for a single growth or differentiation factor molecule to bind to two receptors at the same time, thereby causing ligand-mediated receptor dimerization (Figure 15.17a). This model was supported by the observation that growth and differentiation factors such as platelet-derived growth factor (PDGF) or colony-stimulating factor-1 (CSF-1) are composed of two similar or identical disulfide-linked subunits, in which each subunit contains a receptor-binding site. However, not all growth factors were found to conform to this model. Some growth factors (e.g., EGF or TGFα) contain only a single receptor-binding site. Structural work now supports a second mechanism (Figure 15.17b) in which ligand binding induces a conformational change in the extracellular domain of a receptor, leading to the formation or exposure of a receptor dimerization interface. With this mechanism, ligands act as allosteric regulators that turn on the ability of their receptors to form dimers. A small number of RTKs, including the insulin and IGF-1 receptors, are present as inactive dimers in the absence of ligand (see Figure 15.24).

[Figure 15.17 diagram. Panel (a): ligand-mediated dimerization. Panel (b): receptor-mediated dimerization, in which the ligand induces a dimerization interface. In both panels, inactive monomers become active dimers, followed by trans-autophosphorylation and signal transmission through bound proteins carrying SH2 or PTB domains.]

Figure 15.17 Steps in the activation of a receptor protein-tyrosine kinase (RTK). (a) Ligand-mediated dimerization. In the nonactivated state, the receptors are present in the membrane as monomers. Binding of a bivalent ligand leads directly to dimerization of the receptor and activation of its kinase activity, causing it to add phosphate groups to the cytoplasmic domain of the other receptor subunit. The newly formed phosphotyrosine residues of the receptor serve as binding sites for target proteins containing either SH2 or PTB domains. The target proteins become activated as a result of their interaction with the receptor. (b) Receptor-mediated dimerization. The sequence of events is similar to that in part a, except that the ligand is monovalent and, consequently, a separate ligand molecule binds to each of the inactive monomers. Binding of each ligand induces a conformational change in the receptor that creates a dimerization interface (red arrows). The ligand-bound monomers interact through this interface to become an active dimer. (BASED ON A DRAWING BY J. SCHLESSINGER AND A. ULLRICH, NEURON 9:384, 1992; BY PERMISSION OF CELL PRESS. NEURON BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

For most RTKs, receptor dimerization results in the juxtaposition of two protein-tyrosine kinase domains on the cytoplasmic side of the plasma membrane. Bringing two kinase domains into close contact allows for trans-autophosphorylation, in which the protein kinase activity of one receptor of the dimer phosphorylates tyrosine residues in the cytoplasmic domain of the other receptor of the dimer, and vice versa (Figure 15.17a,b).

Protein Kinase Activation

Autophosphorylation sites on RTKs can carry out two different functions: they can regulate the receptor’s kinase activity or serve as binding sites for cytoplasmic signaling molecules. Kinase activity is usually controlled by autophosphorylation on tyrosine residues that are present in the activation loop of the kinase domain. The activation loop, when unphosphorylated, obstructs the substrate-binding site, thereby preventing ATP from entering. Following its phosphorylation, the activation loop is stabilized in a position away from the substrate-binding site, resulting in activation of the kinase domain. Once their kinase domain has been activated, the receptor subunits proceed to phosphorylate each other on tyrosine residues that are present in regions adjacent to the kinase domain. It is these autophosphorylation sites that act as binding sites for cellular signaling proteins.
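The order of events just described (dimerization, phosphorylation of the activation loop, then phosphorylation of the docking tyrosines) can be sketched as a small state model. This is a deliberately simplified illustration that assumes the receptor-mediated mechanism, in which each monomer needs its own ligand. The class and function names are invented for the example, and the low basal kinase activity that would allow the first phosphorylation is simply assumed rather than modeled.

```python
from dataclasses import dataclass


@dataclass
class Receptor:
    """Toy RTK monomer; the field names are illustrative assumptions."""
    name: str
    ligand_bound: bool = False
    activation_loop_phosphorylated: bool = False
    docking_sites_phosphorylated: bool = False

    @property
    def kinase_active(self) -> bool:
        # A phosphorylated activation loop no longer blocks the
        # substrate-binding site, so the kinase domain counts as active.
        return self.activation_loop_phosphorylated


def can_dimerize(a: Receptor, b: Receptor) -> bool:
    """Receptor-mediated model: both monomers must carry a ligand."""
    return a.ligand_bound and b.ligand_bound


def trans_autophosphorylate(a: Receptor, b: Receptor) -> bool:
    """Run both phosphorylation stages; return True if a dimer formed."""
    if not can_dimerize(a, b):
        return False
    # Stage 1: each subunit phosphorylates its partner's activation loop.
    a.activation_loop_phosphorylated = True
    b.activation_loop_phosphorylated = True
    # Stage 2: the activated kinases phosphorylate tyrosines next to the
    # partner's kinase domain, creating binding sites for signaling proteins.
    if a.kinase_active and b.kinase_active:
        a.docking_sites_phosphorylated = True
        b.docking_sites_phosphorylated = True
    return True
```

With only one ligand-bound monomer the function returns False and nothing is phosphorylated, which mirrors the requirement for dimerization before trans-autophosphorylation.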
Phosphotyrosine-Dependent Protein–Protein Interactions

Signaling pathways consist of a chain of signaling proteins that interact with one another in a sequential manner (see Figure 15.3). Signaling proteins are able to associate with activated protein-tyrosine kinase receptors because such proteins contain domains that bind specifically to phosphorylated tyrosine residues (as in Figure 15.17). The best-studied pTyr-binding domains are the Src-homology 2 (SH2) domain and the phosphotyrosine-binding (PTB) domain. SH2 domains were initially identified as part of proteins encoded by the genome of tumor-causing (oncogenic) viruses. They are composed of approximately 100 amino acids and contain a conserved binding pocket that accommodates a phosphorylated tyrosine residue (Figure 15.18). More than 110 SH2 domains are encoded by the human genome. They mediate a large number of phosphorylation-dependent protein–protein interactions, which occur following phosphorylation of specific tyrosine residues. The specificity of the interactions is determined by the amino acid sequence immediately adjacent to the phosphorylated tyrosine residues. For example, the SH2 domain of the Src protein-tyrosine kinase recognizes pTyr-Glu-Glu-Ile, whereas the SH2 domains of PI 3-kinase bind to pTyr-Met-X-Met (in which X can be any residue). It is interesting to note that the budding-yeast genome encodes only one SH2-domain-containing protein, which correlates with the overall lack of tyrosine-kinase signaling activity in these lower single-celled eukaryotes.
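The sequence specificity described above lends itself to a short pattern-matching sketch. It assumes peptides are written in one-letter amino acid code with phosphotyrosine spelled "pY", a notation chosen here for convenience, and it encodes only the two motifs named in the text. Real SH2 binding depends on affinity and structural context, which a pattern match does not capture.

```python
import re

# Motifs taken from the text: Src SH2 recognizes pTyr-Glu-Glu-Ile and the
# PI 3-kinase SH2 domains recognize pTyr-Met-X-Met (X = any residue).
SH2_MOTIFS = {
    "Src SH2": re.compile(r"pYEEI"),
    "PI 3-kinase SH2": re.compile(r"pYM[A-Z]M"),
}


def sh2_domains_that_bind(peptide):
    """Return the names of SH2 domains whose motif occurs in the peptide.

    An unphosphorylated tyrosine is written "Y" and therefore never matches,
    reflecting the requirement for phosphorylation.
    """
    return [name for name, motif in SH2_MOTIFS.items() if motif.search(peptide)]
```

For example, the heptapeptide of Figure 15.18 (Pro-Asn-pTyr-Glu-Glu-Ile-Pro) would be written "PNpYEEIP" and matches only the Src motif, whereas the same peptide without the phosphate ("PNYEEIP") matches nothing.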

[Figure 15.18 image labels. Peptide residues: −2 Pro, −1 Asn, pTyr, +1 Glu, +2 Glu, +3 Ile, +4 Pro. SH2 domain labels: βA, βB, βC, βD, αB, BG, N, Arg αA2, Arg βB5, Trp βB1, Tyr βD5, Tyr αB9, Leu BG4.]

Figure 15.18 The interaction between an SH2 domain of a protein and a peptide containing a phosphotyrosine residue. The SH2 domain of the protein is shown in a cutaway view with the accessible surface area represented by red dots and the polypeptide backbone as a purple ribbon. The phosphotyrosine-containing heptapeptide (Pro-Asn-pTyr-Glu-Glu-Ile-Pro) is shown as a space-filling model whose side chains are colored green and backbone is colored yellow. The phosphate group is shown in light blue. The phosphorylated tyrosine residue and the isoleucine residue (+3) are seen to project into pockets on the surface of the SH2 domain, creating a tightly fitting interaction, but only when the key tyrosine residue is phosphorylated. (FROM GABRIEL WAKSMAN ET AL., COURTESY OF JOHN KURIYAN, CELL 72:783, 1993. REPRINTED WITH PERMISSION FROM ELSEVIER.)

PTB domains were discovered more recently. They can bind to phosphorylated tyrosine residues that are usually present as part of an asparagine-proline-X-tyrosine (Asn-Pro-X-Tyr) motif. The story is more complicated, however, because some PTB domains appear to bind specifically to an unphosphorylated Asn-Pro-X-Tyr motif, whereas others bind specifically to the phosphorylated motif. PTB domains are poorly conserved and different PTB domains possess different residues that interact with their ligands.

Activation of Downstream Signaling Pathways

We have seen that receptor protein-tyrosine kinases (RTKs) are autophosphorylated on one or more tyrosine residues. A variety of signaling proteins with SH2 or PTB domains are present in the cytoplasm. Receptor activation therefore results in formation of signaling complexes, in which SH2- or PTB-containing signaling proteins bind to specific autophosphorylation sites present on the receptor (as in Figure 15.17). We can distinguish several groups of signaling proteins that can interact with activated RTKs, including adaptor proteins, docking proteins, transcription factors, and enzymes (Figure 15.19).

[Figure 15.19 diagram. Panel (a): Grb2, bound through its SH2 domain to a ligand-activated receptor, linked to Sos and active Ras (Ras-GTP). Panel (b): IRS bound through its PTB domain, with PI3K and Shp2 bound to phosphorylated sites. Panel (c): STAT family transcription factor forming an active STAT dimer that enters the nucleus. Panel (d): PI-PLCγ, PI3K, and Shp2 bound to the receptor.]

Figure 15.19 A diversity of signaling proteins. Cells contain numerous proteins with SH2 or PTB domains that bind to phosphorylated tyrosine residues. (a) Adaptor proteins, such as Grb2, function as a link between other proteins. As shown here, Grb2 can serve as a link between an activated growth factor RTK and Sos, an activator of a downstream protein named Ras. The function of Ras is discussed later. (b) The docking protein IRS contains a PTB domain that allows it to bind to the activated receptor. Once bound, tyrosine residues on the docking protein are phosphorylated by the receptor. These phosphorylated residues function as binding sites for other signaling proteins. (c) Certain transcription factors bind to activated RTKs, an event that leads to the phosphorylation and activation of the transcription factor and its translocation to the nucleus. Members of the STAT family of transcription factors become activated in this manner. (d) A wide array of signaling enzymes are activated following binding to an activated RTK. In the case depicted here, a phospholipase (PLC-γ), a lipid kinase (PI3K), and a protein-tyrosine phosphatase (Shp2) have all bound to phosphotyrosine sites on the receptor.

[Figure 15.20 image labels: SH3-N, SH3-C.]

Figure 15.20 Tertiary structure of an adaptor protein, Grb2. Grb2 consists of three parts: two SH3 domains and one SH2 domain. SH2 domains bind to a protein (e.g., the activated EGF receptor) containing a particular motif that includes a phosphotyrosine residue. SH3 domains bind to a protein (e.g., Sos) that contains a particular motif that is rich in proline residues. Dozens of proteins that bear these domains have been identified. Interactions involving SH3 and SH2 domains are shown in Figures 2.40 and 15.18, respectively. Other adaptor proteins include Nck, Shc, and Crk. (FROM SÉBASTIEN MAIGNAN ET AL. SCIENCE 268:291, 1995. REPRINTED WITH PERMISSION FROM AAAS. IMAGE PROVIDED COURTESY OF ARNAUD DUCRUIX.)

■

Adaptor proteins function as linkers that enable two or more signaling proteins to become joined together as part of a signaling complex (Figure 15.19a). Adaptor proteins contain an SH2 domain and one or more additional protein–protein interaction domains. For instance, the adaptor protein Grb2 contains one SH2 and two SH3 (Src-homology 3) domains (Figure 15.20). As shown on page 62, SH3 domains bind to proline-rich sequence motifs. The SH3 domains of Grb2 bind constitutively to other proteins, including Sos and Gab. The SH2 domain binds to phosphorylated tyrosine residues within a Tyr-X-Asn motif. Consequently, tyrosine phosphorylation of the Tyr-X-Asn motif on an RTK results in translocation of Grb2-Sos or Grb2-Gab from the cytosol to a receptor, which is present at the plasma membrane (Figure 15.19a).

■

Docking proteins, such as IRS, supply certain receptors with additional tyrosine phosphorylation sites (Figure 15.19b). Docking proteins contain either a PTB domain or an SH2 domain and a number of tyrosine phosphorylation sites. Binding of an extracellular ligand to a receptor leads to autophosphorylation of the receptor, which provides a binding site for the PTB or SH2 domain of the docking protein. Once bound together, the receptor phosphorylates tyrosine residues present on the docking protein. These phosphorylation sites then act as binding sites for additional signaling molecules. Docking proteins provide versatility to the signaling process, because the ability of the receptor to turn on signaling molecules can vary with the docking proteins that are expressed in a particular cell.

■

Transcription factors were discussed at length in Chapter 12. Transcription factors that belong to the STAT family play an important role in the function of the immune system. STATs contain an SH2 domain together with a tyrosine phosphorylation site that can act as a binding site for the SH2 domain of another STAT molecule (Figure 15.19c). Tyrosine phosphorylation of STAT SH2 binding sites situated within a dimerized receptor leads to the recruitment of STAT proteins (Figure 15.19c). Upon association with the receptor complex, tyrosine residues in these STAT proteins are phosphorylated. As a result of the interaction between the phosphorylated tyrosine residue on one STAT protein and the SH2 domain on a second STAT protein, and vice versa, these transcription factors will form dimers. Dimers, but not monomers, move to the nucleus where they stimulate the transcription of specific genes involved in an immune response. The role of STATs in signaling an immune response is discussed in Section 17.4.

■

Signaling enzymes include protein kinases, protein phosphatases, lipid kinases, phospholipases, and GTPase activating proteins. When equipped with SH2 domains, these enzymes associate with activated RTKs and are turned on directly or indirectly as a consequence of this association (Figure 15.19d ). Three general mechanisms have been identified by which these enzymes are activated following their association with a receptor. Enzymes can be activated simply as a result of translocation to the membrane, which places them in close proximity to their substrates. Enzymes can also be activated through an allosteric mechanism (page 115), in which binding to phosphotyrosine results in a conformational change in the SH2 domain that causes a conformational change in the catalytic domain, resulting in a change in catalytic activity. Finally, enzymes can be regulated directly by phosphorylation. As will be described below, signaling proteins that associate with activated RTKs initiate cascades of events that lead to the biochemical changes required to respond to the presence of extracellular messenger molecules.
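The point made above about docking proteins, that the output of a receptor can vary with the docking proteins a cell expresses, can be illustrated with a lookup sketch. The IRS partners follow Figure 15.19b (PI3K and Shp2); "DockX" and its partner list are hypothetical, added only to show a contrasting cell type.

```python
# Partner lists are illustrative. "IRS" follows Figure 15.19b; "DockX" is a
# hypothetical docking protein invented for this example.
DOCKING_PARTNERS = {
    "IRS": ["PI3K", "Shp2"],
    "DockX": ["Grb2"],
}


def recruited_proteins(receptor_phosphorylated, expressed_docking_proteins):
    """Return the proteins assembled at the receptor in this toy model.

    A docking protein binds only an autophosphorylated receptor; once bound,
    it is phosphorylated and recruits its own partners.
    """
    if not receptor_phosphorylated:
        return set()
    recruited = set()
    for dock in expressed_docking_proteins:
        recruited.add(dock)
        recruited.update(DOCKING_PARTNERS.get(dock, []))
    return recruited
```

Calling the function with the same activated receptor but different docking proteins returns different sets of recruited proteins, which is the versatility the text describes.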

Ending the Response

Signal transduction by RTKs is usually terminated by internalization of the receptor. Exactly what causes receptor internalization remains an area of active research. One mechanism involves a receptor-binding protein named Cbl. When RTKs are activated by ligands, they autophosphorylate tyrosine residues, which can act as a binding site for Cbl, which possesses an SH2 domain. Cbl then associates with the receptor and catalyzes the attachment of a ubiquitin molecule to the receptor. Ubiquitin is a small protein that is linked covalently to other proteins, thereby marking those proteins for internalization (page 312) or degradation (page 542). Binding of the Cbl complex to activated receptors is followed by receptor ubiquitination and internalization. As in the case of GPCRs (Figure 15.7), internalized RTKs can have several alternate fates; they can be degraded in lysosomes, returned to the plasma membrane, or become part of endosomal signaling complexes and engage in continued intracellular signaling.

Now that we have discussed some of the general mechanisms by which RTKs are able to activate signaling pathways, we can look more closely at a couple of important pathways that are activated downstream of RTKs. First we will discuss the Ras-MAP kinase pathway, which is probably the best characterized signaling cascade that is turned on by activated protein-tyrosine kinases. A different cascade will be described in the context of the insulin receptor.

The Ras-MAP Kinase Pathway

Retroviruses are small viruses that carry their genetic information in the form of RNA. Some of these viruses contain genes, called oncogenes, that enable them to transform normal cells into tumor cells. Ras was originally described as the product of a retroviral oncogene and, only later, determined to be derived from its mammalian host. It was subsequently discovered that approximately 30 percent of all human cancers contain mutant versions of RAS genes.

At this point it is important to note that Ras proteins are part of a superfamily of more than 150 small (monomeric) G proteins including the Rabs (page 302), Sar1 (page 296), and Ran (page 492). These proteins are involved in the regulation of numerous processes, including cell division, differentiation, gene expression, cytoskeletal organization, vesicle trafficking, and nucleocytoplasmic transport. The principles discussed in connection with Ras apply to many members of the small G-protein superfamily.

Ras is a small GTPase that is anchored at the inner surface of the plasma membrane by a covalently attached lipid group that is embedded in the inner leaflet of the bilayer (Figure 15.19a). Ras is functionally similar to the heterotrimeric G proteins that were discussed earlier and, like those proteins, Ras also acts as both a switch and a molecular timer. Unlike heterotrimeric G proteins, however, Ras consists of only a single small subunit. Ras proteins are present in two different forms: an active GTP-bound form and an inactive GDP-bound form (Figure 15.21a). Ras-GTP binds and activates downstream signaling proteins. Ras is turned off by hydrolysis of its bound GTP to GDP. Mutations in one of the human RAS genes that lead to tumor formation prevent the protein from hydrolyzing the bound GTP back to the GDP form. As a result, the mutant version of Ras remains in the “on” position, sending a continuous message downstream along the signaling pathway, keeping the cell in the proliferative mode.

The cycling of monomeric G proteins, such as Ras, between active and inactive states is aided by accessory proteins that bind to the G protein and regulate its activity (Figure 15.21b). These accessory proteins include

1. GTPase-activating proteins (GAPs). Most monomeric G proteins possess some capability to hydrolyze a bound GTP, but this capability is greatly accelerated by interaction with specific GAPs. Because they stimulate hydrolysis of the bound GTP, which inactivates the G protein, GAPs dramatically shorten the duration of a G protein-mediated response. Mutations in one of the Ras-GAP genes (NF1) cause neurofibromatosis 1, a disease in which patients develop large numbers of benign tumors (neurofibromas) along the sheaths that line the nerve trunks.
2. Guanine nucleotide-exchange factors (GEFs). An inactive G protein is converted to the active form when the bound GDP is replaced with a GTP. GEFs are proteins that bind to an inactive monomeric G protein and stimulate dissociation of the bound GDP. Once the GDP is released, the G protein rapidly binds a GTP, which is present at relatively high concentration in the cell, thereby activating the G protein.
3. Guanine nucleotide-dissociation inhibitors (GDIs). GDIs are proteins that inhibit the release of a bound GDP from a monomeric G protein, thus maintaining the protein in the inactive, GDP-bound state.

The activity and localization of these various accessory proteins are tightly regulated by other proteins, which thus regulate the state of the G protein. Ras-GTP can be thought of as a signaling hub because it can interact directly with several downstream targets. Here we will discuss Ras as an element of the Ras-MAP kinase cascade. The Ras-MAP kinase cascade is turned on in response to a wide variety of extracellular signals and plays a key role in regulating vital activities such as cell proliferation and differentiation. The pathway relays extracellular signals from the plasma membrane through the cytoplasm and into the nucleus. The overall outline of the pathway is depicted in Figure 15.22. This pathway is activated when a growth factor, such as EGF or PDGF, binds to the extracellular domain of its RTK. Many activated RTKs possess phosphorylated tyrosine residues that act as docking sites for the adaptor protein Grb2.

[Figure 15.21 diagram. Panel (a): Ras structure with the switch I and switch II regions labeled. Panel (b): the G protein cycle, steps 1a to 6, with labels GDI, GEF, GAP, GDP, GTP, inactive G protein, active G protein, inactive target protein, active target protein, "Switch ON", "Clock", "Switch OFF", and "Signal transmitted downstream".]

Figure 15.21 The structure of a G protein and the G protein cycle. (a) Comparison of the tertiary structure of the active GTP-bound state (red) and inactive GDP-bound state (green) of the small G protein Ras. A bound guanine nucleotide is depicted in the ball-and-stick form. The differences in conformation occur in two flexible regions of the molecule known as switch I and switch II. The difference in conformation shown here affects the molecule’s ability to bind to other proteins. (b) The G protein cycle. G proteins are in their inactive state when they are bound by a molecule of GDP. If the inactive G protein interacts with a guanine nucleotide dissociation inhibitor (GDI), release of the GDP is inhibited and the protein remains in the inactive state (step 1a). If the inactive G protein interacts with a guanine nucleotide exchange factor (GEF; step 1b), the G protein exchanges its GDP for a GTP (step 2), which activates the G protein so that it can bind to a downstream target protein (step 3). Binding to the GTP-bound G protein activates the target protein, which is typically an enzyme such as a protein kinase or a protein phosphatase. This has the effect of transmitting the signal farther downstream along the signaling pathway. G proteins have a weak intrinsic GTPase activity that is stimulated by interaction with a GTPase-activating protein (GAP) (step 4). The degree of GTPase stimulation by a GAP determines the length of time that the G protein is active. Consequently, the GAP serves as a type of clock that regulates the duration of the response (step 5). Once the GTP has been hydrolyzed, the complex dissociates, and the inactive G protein is ready to begin a new cycle (step 6). (A: FROM STEVEN J. GAMBLIN AND STEPHEN J. SMERDON, STRUCT. 7:R200, 1999. REPRINTED WITH PERMISSION FROM ELSEVIER.)
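The G protein cycle of Figure 15.21b can be viewed as a two-state switch whose transitions are gated by GEFs, GAPs, and GDIs. The following is a minimal sketch under that view. The class and method names are invented for illustration, and the `gtpase_dead` flag is an assumption used to mimic a tumor-associated Ras mutant that cannot hydrolyze GTP.

```python
class MonomericGProtein:
    """Toy two-state model of a small G protein such as Ras."""

    def __init__(self):
        self.nucleotide = "GDP"   # inactive by default
        self.gdi_bound = False

    @property
    def active(self):
        return self.nucleotide == "GTP"

    def bind_gdi(self):
        # A bound GDI inhibits release of GDP (step 1a).
        self.gdi_bound = True

    def release_gdi(self):
        self.gdi_bound = False

    def gef_exchange(self):
        """GEF stimulates GDP release and GTP then binds (steps 1b and 2)."""
        if self.nucleotide == "GDP" and not self.gdi_bound:
            self.nucleotide = "GTP"
            return True
        return False

    def gap_hydrolysis(self, gtpase_dead=False):
        """GAP-stimulated hydrolysis switches the protein off (steps 4 to 6)."""
        if self.active and not gtpase_dead:
            self.nucleotide = "GDP"
            return True
        return False
```

A normal protein cycles on with `gef_exchange()` and off with `gap_hydrolysis()`, whereas a call with `gtpase_dead=True` leaves it in the GTP-bound state, corresponding to the mutant Ras that remains in the "on" position.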

642 Growth factor 1

P

2

Receptor PTK

Receptor

P P

3

Grb2

Sos

4

Ras-GDP

Ras-GTP 5

Soluble Raf

P

Membrane-bound

MAPKKK

Raf

6

MEK

P

MEK

MAPKK

P

7

ERK

ERK

MAPK P

Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells

11 8

TF

P

TF

P TF

Gene

9 10

Figure 15.22 The steps of a generalized MAP kinase cascade. Binding of growth factor to its receptor (step 1) leads to the autophosphorylation of tyrosine residues of the receptor (step 2) and the subsequent recruitment of the Grb2-Sos proteins (step 3). This complex causes the GTP-GDP exchange of Ras (step 4), which recruits the protein Raf to the membrane, where it is phosphorylated and thus activated (step 5). In the pathway depicted here, Raf phosphorylates and activates another kinase named MEK (step 6), which in turn phosphorylates and activates still another kinase termed ERK (step 7). This three-step phosphorylation scheme shown in steps 5–7 is characteristic of all MAP kinase cascades. Because of their sequential kinase activity, Raf is known as a MAPKKK (MAP kinase kinase kinase), MEK as a MAPKK (MAP kinase kinase), and ERK as a MAPK (MAP kinase). MAPKKs are dual-specificity kinases, a term denoting that they can phosphorylate tyrosine as well as serine and threonine residues. All MAPKs have a tripeptide near their catalytic site with the sequence Thr-X-Tyr. MAPKK phosphorylates MAPK on both the threonine and tyrosine residues of this sequence, thereby activating the enzyme (step 7). Once activated, MAPK translocates into the nucleus where it phosphorylates transcription factors (TF, step 8), such as Elk-1. Phosphorylation of the transcription factors increases their affinity for regulatory sites on the DNA (step 9), leading to an increase in the transcription of specific genes (e.g., Fos and Jun) involved in the growth response. One of the genes whose expression is stimulated encodes a MAPK phosphatase (MKP-1; step 10). Members of the MKP family can remove phosphate groups from both tyrosine and threonine residues of MAPK (step 11), which inactivates MAPK and stops further signaling activity along the pathway. (H. SUN AND N. K. TONKS, TRENDS BIOCHEM SCIENCE 19:484, 1994. TRENDS IN BIOCHEMICAL SCIENCES BY INTERNATIONAL UNION OF BIOCHEMISTRY REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


Grb2, in turn, binds to Sos, which is a guanine nucleotide exchange factor (a GEF) for Ras. Creation of a Grb2-binding site on an activated receptor promotes the translocation of Grb2-Sos from the cytoplasm to the cytoplasmic surface of the plasma membrane, placing Sos in close proximity to Ras (as in Figure 15.19a). Simply bringing Sos to the plasma membrane is sufficient to cause Ras activation. This was illustrated by an experiment with a mutant version of Sos that is permanently tethered to the inner surface of the plasma membrane. Expression of this membrane-bound Sos mutant results in constitutive activation of Ras and transformation of the cell to a malignant phenotype. Interaction with Sos opens the Ras nucleotide-binding site. As a result, GDP is released and is replaced by GTP. Exchange of GDP for GTP in the nucleotide-binding site of Ras results in a conformational change and the creation of a binding interface for a number of proteins, including an important signaling protein called Raf. Raf is then recruited to the inner surface of the plasma membrane, where it is activated by a combination of phosphorylation and dephosphorylation reactions. Raf is a serine-threonine protein kinase. One of its substrates is the protein kinase MEK (Figure 15.22). MEK, which is activated as a consequence of phosphorylation by Raf, goes on to phosphorylate and activate two MAP kinases named ERK1 and ERK2. Over 160 proteins that can be phosphorylated by these kinases have been identified, including transcription factors, protein kinases, cytoskeletal proteins, apoptotic regulators, receptors, and other signaling proteins. Once activated, the MAP kinase is able to move into the nucleus, where it phosphorylates and activates a number of transcription factors and other nuclear proteins. Eventually, the pathway leads to the activation of genes involved in cell proliferation, including cyclin D1, which plays a key role in driving a cell from G1 into S phase (Figure 14.8).

As discussed in the following chapter, oncogenes are identified by their ability to cause cells to become cancerous. Oncogenes are derived from normal cellular genes that have either become mutated or are overexpressed. Many of the proteins that play a part in the Ras signaling pathway were discovered because they were encoded by cancer-causing oncogenes. These include the genes for Ras, Raf, and a number of the transcription factors generated at the end of the pathway (e.g., Fos and Jun). Genes for several of the RTKs situated at the beginning of the pathway, including the receptors for both EGF and PDGF, have also been identified among the many known oncogenes. The fact that so many proteins in this pathway are encoded by genes that can cause cancer when mutated emphasizes the importance of the pathway in the control of cell growth and proliferation.
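The sequential logic of the cascade in Figure 15.22 can be illustrated with a small Boolean sketch. This is only a toy model, not a quantitative description of the pathway: the function name `run_cascade`, its parameters, and the all-or-none treatment of each kinase are assumptions made for illustration. It shows two points from the text: each kinase becomes active only if the one above it is active, and the phosphatase MKP-1 switches off the MAPK without affecting the upstream kinases.

```python
# Kinase tiers of the generalized cascade, in the order named in Figure 15.22.
CASCADE = [("MAPKKK", "Raf"), ("MAPKK", "MEK"), ("MAPK", "ERK")]


def run_cascade(ras_gtp: bool, mkp1_present: bool = False) -> dict:
    """Toy all-or-none model of the Ras -> Raf -> MEK -> ERK cascade.

    ras_gtp:       True if Ras carries GTP (hypothetical input flag).
    mkp1_present:  True if the MAPK phosphatase MKP-1 has been expressed.
    """
    state = {}
    upstream_on = ras_gtp  # Ras-GTP recruits Raf to the membrane for activation
    for _tier, kinase in CASCADE:
        # Each kinase is phosphorylated (activated) only by an active upstream step.
        state[kinase] = upstream_on
        upstream_on = state[kinase]
    if mkp1_present:
        # MKP-1 removes the activating phosphates from the MAPK only.
        state["ERK"] = False
    # Active MAPK enters the nucleus and phosphorylates transcription factors.
    state["TF_phosphorylated"] = state["ERK"]
    return state


if __name__ == "__main__":
    print(run_cascade(ras_gtp=True))
    print(run_cascade(ras_gtp=True, mkp1_present=True))
```

In this sketch, supplying MKP-1 leaves Raf and MEK active but silences ERK and its transcription factor targets, which mirrors the negative feedback described in steps 10 and 11 of the figure.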


Figure 15.23 The roles of scaffolding proteins in mediating two yeast MAPK pathways. (a) The MAPK pathway that regulates mating in these cells is elicited by a mating factor that binds to a GPCR, Ste2, leading to the activation of a Gβγ, which binds to the scaffolding protein Ste5, which in turn binds the MAPKKK, MAPKK, and MAPK proteins of the pathway. (b) The MAPK pathway that regulates the yeast osmoregulatory response in cells exposed to high salt. The activated receptor (Sho1) binds to the Pbs2 scaffolding protein by its SH3 domain. The MAPKKK Ste11 is shared in these two pathways but is recruited into one or the other response by virtue of its interaction with the appropriate protein scaffold. The scaffold Pbs2 does not recruit a separate MAPKK, but has its own MAPKK enzymatic activity. (c) When cells are genetically engineered to express a chimeric Ste5-Pbs2 scaffold, they respond to a mating factor by exhibiting the osmoregulatory response. (See Science 332:680, 2011, for a discussion of scaffold proteins and this experiment.)
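The routing role of the scaffolds in Figure 15.23 can be sketched as a lookup from stimulus to response. The `Scaffold` type and its `senses` and `activates` fields are hypothetical names introduced here. Treating the chimera as the input side of Ste5 joined to the output side of Pbs2 is a simplifying assumption inferred from the outcome reported in part c, not a description of how the protein was actually built.

```python
from typing import List, NamedTuple


class Scaffold(NamedTuple):
    name: str
    senses: str      # upstream stimulus that engages this scaffold
    activates: str   # MAPK at the bottom of the cascade it assembles


# Wild-type scaffolds, as drawn in Figure 15.23a and b.
STE5 = Scaffold("Ste5", senses="mating factor", activates="Fus3")
PBS2 = Scaffold("Pbs2", senses="high salt", activates="Hog1")

# Assumed simplification: the chimera senses like Ste5 but signals like Pbs2.
CHIMERA = Scaffold("Ste5-Pbs2", senses=STE5.senses, activates=PBS2.activates)

RESPONSE_OF_MAPK = {
    "Fus3": "mating response",
    "Hog1": "osmoregulatory response",
}


def cell_response(stimulus: str, scaffolds: List[Scaffold]) -> List[str]:
    """Return the responses triggered by a stimulus, given the scaffolds expressed."""
    return [RESPONSE_OF_MAPK[s.activates] for s in scaffolds if s.senses == stimulus]


if __name__ == "__main__":
    print(cell_response("mating factor", [STE5, PBS2]))
    print(cell_response("mating factor", [CHIMERA]))
```

The shared MAPKKK Ste11 does not appear in the lookup at all. In this picture, which response a stimulus evokes is determined entirely by the scaffold, which is the point the chimera experiment makes.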

15.4 Protein-Tyrosine Phosphorylation as a Mechanism for Signal Transduction

Adapting the MAP Kinase to Transmit Different Types of Information

The same basic pathway from RTKs through Ras to the activation of transcription factors, as illustrated in Figure 15.22, is found in all eukaryotes investigated, from yeast through flies and nematodes to mammals. Evolution has adapted the pathway to meet many different ends. In yeast, for example, the MAP kinase cascade is required for cells to respond to mating factors; in fruit flies, the pathway is utilized during the differentiation of the photoreceptors in the compound eye; and in flowering plants, the pathway transmits signals that initiate a defense against pathogens. In each case, the core of the pathway contains a trio of enzymes that act sequentially: a MAP kinase kinase kinase (MAPKKK), a MAP kinase kinase (MAPKK), and a MAP kinase (MAPK) (Figure 15.22). Each of these components is represented in a particular organism by a small family of proteins. To date, 14 different MAPKKKs, 7 different MAPKKs, and 13 different MAPKs have been identified in mammals. By utilizing different members of these protein families, mammals are able to assemble a number of different MAP kinase pathways that transmit different types of extracellular signals. We have already described how mitogenic stimuli are transmitted along one type of MAP kinase pathway that leads to cell proliferation. In contrast, when cells are exposed to stressful stimuli, such as X-rays or damaging chemicals, signals are transmitted along different MAP kinase pathways that cause the cell to withdraw from the cell cycle, rather than progressing through it as indicated in Figure 15.22. Withdrawal from the cell cycle gives the cell time to repair the damage resulting from the adverse conditions. Recent studies have focused on the signaling specificity of MAP kinase cascades in an attempt to understand how cells are able to utilize similar proteins as components of pathways that elicit different cellular responses.

Studies of amino acid sequences and protein structures suggest that part of the answer lies in selective interactions between enzymes and substrates. For example, certain members of the MAPKKK family phosphorylate specific members of the MAPKK family, which in turn phosphorylate specific members of the MAPK family. But many members of these families can participate in more than one MAPK signaling pathway. Specificity in MAP kinase pathways is also achieved by spatial localization of the component proteins. Spatial localization is accomplished by structural (i.e., nonenzymatic) proteins referred to as scaffolding proteins, whose apparent function is to tether the appropriate members of a signaling pathway in a specific spatial orientation that enhances their mutual interactions. The AKAPs depicted in Figure 15.16 are examples of scaffolding proteins involved in cAMP-driven pathways. Another group of scaffolding proteins, such as the yeast proteins shown in Figure 15.23a,b, play a role in routing signals through one of various MAP kinase pathways. In some cases, scaffolding proteins can take an active role in signaling events. For example, they can induce a change in conformation of bound signaling proteins, leading to their activation or inhibition. A few scaffolding proteins are known to have an enzymatic role, as illustrated by the MAPKK activity of the yeast Pbs2 scaffold shown in Figure 15.23b. In addition to facilitating a particular series of reactions, scaffolding proteins may prevent proteins involved in one signaling pathway from participating in other pathways. As a result, several pathways can share the same limited set of signaling proteins without compromising specificity. This is the case for the yeast MAPKKK protein Ste11 shown in Figure 15.23a,b, which participates in both the mating and osmoregulatory responses depending upon which scaffolding protein it has interacted with. The importance of scaffolding proteins is well illustrated by an experiment in which parts of two different yeast MAPK-cascade scaffolding proteins (Ste5 and Pbs2) were genetically combined to form a chimeric protein (Ste5-Pbs2) (Figure 15.23c). Normally these two scaffolds mediate two different MAPK signaling pathways (Figure 15.23a,b). When yeast cells containing the chimeric protein were exposed to a mating factor that normally stimulates the mating response, the cells responded by displaying the osmoregulatory response.

Signaling by the Insulin Receptor

Our bodies spend considerable effort maintaining blood glucose levels within a narrow range. A decrease in blood glucose levels can lead to loss of consciousness and coma, as the central nervous system depends largely on glucose for its energy metabolism. A persistent elevation in blood glucose levels results in a loss of glucose, fluids, and electrolytes in the urine and serious health problems. The levels of glucose in the circulation are monitored by the pancreas. When blood glucose levels fall below a certain level, the alpha cells of the pancreas secrete glucagon. As discussed earlier, glucagon acts through GPCRs and stimulates the breakdown of glycogen, resulting in an increase in blood glucose levels. When glucose levels rise, as occurs after a carbohydrate-rich meal, the beta cells of the pancreas respond by secreting insulin. Insulin functions as an extracellular messenger molecule, informing cells that glucose levels are high. Cells that express insulin receptors on their surface, such as cells in the liver, respond to this message by increasing glucose uptake, increasing glycogen and triglyceride synthesis, and/or decreasing gluconeogenesis.

The Insulin Receptor Is a Protein-Tyrosine Kinase

Each insulin receptor is composed of an α and a β chain, which are derived from a single precursor protein by proteolytic processing. The α chain is entirely extracellular and contains the insulin-binding site. The β chain is composed of an extracellular region, a single transmembrane region, and a cytoplasmic region (Figure 15.24). The α and β chains are linked together by disulfide bonds (Figure 15.24). Two of these αβ heterodimers are held together by disulfide bonds between the α chains. Thus, while most RTKs are thought to be present on the cell surface as monomers, insulin receptors are present as stable dimers. Like other RTKs, insulin receptors are inactive in the absence of ligand (Figure 15.24a). Recent work suggests that the insulin receptor dimer binds a single insulin molecule. This causes repositioning of the ligand-binding domains on the outside of the cell, which causes the tyrosine kinase domains on the inside of the cell to come into close physical proximity (Figure 15.24b). Juxtaposition of the kinase domains leads to trans-autophosphorylation and receptor activation (Figure 15.24c). Several tyrosine phosphorylation sites have been identified in the cytoplasmic region of the insulin receptor. Three of these phosphorylation sites are present in the activation loop. In the unphosphorylated state, the activation loop assumes a conformation in which it occupies the active site. Upon phosphorylation of the three tyrosine residues, the activation loop assumes a new conformation away from the catalytic cleft. This new conformation requires a rotation of the small and large lobes of the kinase domain with respect to each other, thereby bringing residues that are essential for catalysis closer together. In addition, the activation loop now leaves the catalytic cleft open so that it can bind substrates. Following activation of the kinase domain, the receptor phosphorylates itself on tyrosine residues that are present adjacent to the membrane and in the carboxyl-terminal tail (Figure 15.24c).

Figure 15.24 The response of the insulin receptor to ligand binding. (a) The insulin receptor, shown here in schematic form in the inactive state, is a tetramer consisting of two α and two β subunits. (b) In this model, binding of a single insulin molecule to the α subunits causes a conformational change in the β subunits, which activates the tyrosine kinase activity of the β subunits. (c) The activated β subunits phosphorylate tyrosine residues located on the cytoplasmic domain of the receptor as well as tyrosine residues on several insulin receptor substrates (IRSs) that are discussed below.

Insulin Receptor Substrates 1 and 2

Most RTKs possess autophosphorylation sites that directly recruit SH2 domain-containing signaling proteins (as in Figure 15.19a, c, and d). The insulin receptor is an exception to this general rule, because it associates instead with a small family of docking proteins (Figure 15.19b), called insulin-receptor substrates (IRSs). The IRSs, in turn, provide the binding sites for SH2 domain-containing signaling proteins. Some of the events that occur during insulin signaling are shown in Figure 15.25. Following ligand binding and kinase activation, the insulin receptor autophosphorylates tyrosine #960, which then forms a binding site for the phosphotyrosine binding (PTB) domains of insulin receptor substrates. As indicated in Figure 15.25a, IRSs are characterized by the presence of an N-terminal PH domain, a PTB domain, and a long tail containing tyrosine phosphorylation sites. The PH domain may interact with phospholipids present at the inside leaflet of the plasma membrane, the PTB domain binds to tyrosine phosphorylation sites on the activated receptor, and the tyrosine phosphorylation sites provide docking sites for SH2 domain-containing signaling proteins. At least four members of the IRS family have been identified. Based on the results obtained in knockout experiments in mice, it is thought that IRS-1 and IRS-2 are most relevant to insulin-receptor signaling.

Autophosphorylation of the activated insulin receptor at Tyr960 provides a binding site for IRS-1 or IRS-2. Only after stable association with either IRS-1 or IRS-2 is the activated insulin receptor able to phosphorylate tyrosine residues present on these docking proteins (Figure 15.25b). Both IRS-1 and IRS-2 contain a large number of potential tyrosine phosphorylation sites that include binding sites for the SH2 domains of PI 3-kinase, Grb2, and Shp2 (Figure 15.25a,b). These proteins associate with the receptor-bound IRS-1 or IRS-2 and activate downstream signaling pathways.

PI 3-kinase (PI3K) is composed of two subunits, one containing two SH2 domains and the other containing the catalytic domain (Figure 15.25b). PI 3-kinase, which is activated directly as a consequence of binding of its two SH2 domains to tyrosine phosphorylation sites, phosphorylates phosphoinositides at the 3 position of the inositol ring (Figure 15.25c). The products of this enzyme, which include PI 3,4-bisphosphate [PI(3,4)P2] and PI 3,4,5-trisphosphate (PIP3), remain in the cytosolic leaflet of the plasma membrane where they provide binding sites for PH domain-containing signaling proteins such as the serine-threonine kinases PKB and PDK1. As indicated in Figure 15.25c, PKB (more commonly known as AKT) plays a role in mediating the response to insulin, as well as to other extracellular signals. Recruitment of PDK1 to the plasma membrane, in close proximity to PKB, provides a setting in which PDK1 can phosphorylate and activate the Ser/Thr kinase activity of PKB (Figure 15.25c). While phosphorylation by PDK1 is essential, it is not sufficient for activation of PKB. Activation of PKB also depends on phosphorylation by a second kinase, mTOR, which has a crucial role in regulating numerous cellular activities. PI3K signaling is terminated by removal of the phosphate at the 3 position on the inositol ring by the lipid phosphatase PTEN (Figure 15.25c).

Figure 15.25 The role of tyrosine-phosphorylated IRS in activating a variety of signaling pathways. (a) Schematic representation of an IRS polypeptide. The N-terminal portion of the molecule contains a PH domain that allows it to bind to phosphoinositides of the membrane and a PTB domain that allows it to bind to a specific phosphorylated tyrosine residue (#960) on the cytoplasmic domain of an activated insulin receptor. Once bound to the insulin receptor, a number of tyrosine residues in the IRS may be phosphorylated (indicated as Y). These phosphorylated tyrosines can serve as binding sites for other proteins, including a lipid kinase (PI3K), an adaptor protein (Grb2), and a protein-tyrosine phosphatase (Shp2). (b) Phosphorylation of IRSs by the activated insulin receptor is known to activate PI3K and Ras pathways, both of which are discussed in the chapter. Other pathways that are less well defined are also activated by IRSs. (The IRS is drawn as an extended, two-dimensional molecule for purposes of illustration.) (c) Activation of PI3K leads to the formation of membrane-bound phosphoinositides, including PIP3. One of the key kinases in numerous signaling pathways is PKB (AKT), which interacts at the plasma membrane with PIP3 by means of a PH domain. This interaction changes the conformation of PKB, making it a substrate for another PIP3-bound kinase (PDK1), which phosphorylates PKB. The second phosphate shown linked to PKB is added by a second kinase, most likely mTOR. Once activated, PKB dissociates from the plasma membrane and moves into the cytosol and nucleus. PKB is a major component of a number of separate signaling pathways that mediate the insulin response. These pathways lead to translocation of glucose transporters to the plasma membrane, synthesis of glycogen, and the synthesis of new proteins in the cell. PKB also plays a key role in promoting cell survival by inhibiting the proapoptotic protein Bad (page 659) and/or activating the transcription factor NF-κB (page 660).

Glucose Transport

PKB is directly involved in regulating glucose transport and glycogen synthesis. The glucose transporter GLUT4 carries out insulin-dependent glucose transport from the blood (page 157). In the absence of insulin, GLUT4 is present in membrane vesicles in the cytoplasm of insulin-responsive cells (Figure 15.26). These vesicles fuse with the plasma membrane in response to insulin, a process that is referred to as GLUT4 translocation. The increase in numbers of glucose transporters in the plasma membrane leads to increased glucose uptake (Figure 15.26). GLUT4 translocation depends on activation of PI 3-kinase and PKB. This conclusion is based on experiments showing that inhibitors of PI3K block GLUT4 translocation. In addition, overexpression of PI3K or PKB stimulates GLUT4 translocation. It is well known that many receptors activate PI3K, whereas it is only the insulin receptor that stimulates GLUT4 translocation. This suggests that there is a second pathway downstream of the insulin receptor that is essential for GLUT4 translocation to occur. Detailed understanding of how the two pathways work together to stimulate GLUT4 translocation is still lacking.

Excess glucose that is taken up by muscle and liver cells is stored in the form of glycogen. Glycogen synthesis is carried out by glycogen synthase, an enzyme that is turned off by phosphorylation on serine and threonine residues. Glycogen synthase kinase-3 (GSK-3) has been identified as a negative regulator of glycogen synthase. GSK-3, in turn, is inactivated following phosphorylation by PKB. Thus, activation of the PI 3-kinase-PKB pathway in response to insulin leads to a decrease in GSK-3 kinase activity, resulting in an increase in glycogen synthase activity (Figure 15.25c). Activation of protein phosphatase 1, an enzyme known to dephosphorylate glycogen synthase, contributes further to glycogen synthase activation (Figure 15.14).

Figure 15.26 Regulation of glucose uptake in muscle and fat cells by insulin. Glucose transporters are stored in the walls of cytoplasmic vesicles that form by budding from the plasma membrane (endocytosis). When the insulin level increases, a signal is transmitted through the IRS-PI3K-PKB pathway, which triggers the translocation of cytoplasmic vesicles to the cell periphery. The vesicles fuse with the plasma membrane (exocytosis), delivering the transporters to the cell surface where they can mediate glucose uptake. A second pathway leading from the insulin receptor to GLUT4 translocation is not shown (see Trends Biochem. Sci. 31:215, 2006). (D. VOET AND J. G. VOET, BIOCHEMISTRY, 2E; COPYRIGHT 1995, JOHN WILEY & SONS, INC. REPRINTED BY PERMISSION OF JOHN WILEY & SONS, INC.)

Diabetes Mellitus

One of the most common human diseases, diabetes mellitus, is caused by defects in insulin signaling. Diabetes occurs in two varieties: type 1, which accounts for 5–10 percent of the cases, and type 2, which accounts for the remaining 90–95 percent. Type 1 diabetes is caused by an inability to produce insulin and is discussed in the Human Perspective of Chapter 17. Type 2 diabetes is a more complex disease whose incidence is increasing around the world at an alarming rate. The rising incidence of the disease is most likely a result of changing lifestyle and eating habits. A high-calorie diet combined with a sedentary lifestyle is thought to lead to a chronic increase in insulin secretion. Elevated levels of insulin overstimulate target cells in the liver and elsewhere in the body, which leads to a condition referred to as insulin resistance, in which these target cells stop responding to the presence of the hormone. This in turn leads to a chronic elevation in blood glucose levels, which stimulates the pancreas to secrete even more insulin, setting up a vicious cycle that can ultimately lead to the death of the insulin-secreting beta cells of the pancreas.

Most of the health risks that result from diabetes (cardiovascular disease, blindness, kidney disease, and reduced circulation in the limbs leading to amputations) are thought to be due to damage to the body's blood vessels, but the molecular mechanism by which insulin resistance and its consequent metabolic effects lead to this condition remains the subject of debate. The relationship between insulin signaling pathways and lifespan is discussed in the accompanying Human Perspective.
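The insulin signaling logic described in this section (IRS docking, PI3K, PIP3, the two phosphorylations of PKB, termination by PTEN, and the GSK-3 and glycogen synthase switch) can be summarized in a Boolean sketch. It is a deliberately crude illustration. The function `insulin_response` and all of its parameter names are hypothetical, every step is treated as all-or-none, the contribution of protein phosphatase 1 is ignored, and the poorly defined second pathway needed for GLUT4 translocation is simply assumed to follow receptor activation.

```python
def insulin_response(
    insulin: bool,
    pi3k_inhibited: bool = False,
    pip3_removed_by_pten: bool = False,
    pdk1_active: bool = True,
    mtor_active: bool = True,
) -> dict:
    """Toy all-or-none model of the insulin -> IRS -> PI3K -> PKB pathway."""
    receptor_active = insulin                 # ligand binding, trans-autophosphorylation
    irs_phosphorylated = receptor_active      # IRS docks at Tyr960 and is phosphorylated
    pi3k_active = irs_phosphorylated and not pi3k_inhibited
    pip3 = pi3k_active and not pip3_removed_by_pten   # PTEN removes the 3-phosphate

    # PKB needs PIP3 plus phosphorylation by BOTH PDK1 and the second kinase (mTOR).
    pkb_active = pip3 and pdk1_active and mtor_active

    # Double-negative switch: PKB inactivates GSK-3, which otherwise
    # phosphorylates and turns off glycogen synthase.
    gsk3_active = not pkb_active
    glycogen_synthase_active = not gsk3_active

    # Assumption: the second, insulin-specific pathway is on whenever the receptor is.
    second_pathway = receptor_active
    glut4_at_membrane = pkb_active and second_pathway

    return {
        "pkb_active": pkb_active,
        "gsk3_active": gsk3_active,
        "glycogen_synthase_active": glycogen_synthase_active,
        "glut4_at_membrane": glut4_at_membrane,
    }


if __name__ == "__main__":
    print(insulin_response(insulin=True))
    print(insulin_response(insulin=True, pi3k_inhibited=True))
```

Setting `pi3k_inhibited=True` reproduces, in this toy form, the experimental observation that PI3K inhibitors block GLUT4 translocation. Setting `mtor_active=False` shows why phosphorylation by PDK1 alone is described as essential but not sufficient.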


THE HUMAN PERSPECTIVE

Signaling Pathways and Human Longevity

Many factors are known to contribute to the aging process, some genetic and others nongenetic. Discussions of aging have appeared in several places in the text: in the Human Perspective on free radicals (page 35); in the Human Perspective on mitochondrial diseases (page 208); in the Human Perspective on DNA repair deficiencies (page 569); in the section on the nuclear lamina (page 490); and in the discussion of telomeres (page 508). In recent years, a new contributor to the aging process has received attention: the activity of a signaling pathway involving insulin and a related protein, IGF-1, which is the focus of the present Human Perspective.

The lifespans of animals can be increased by restricting the calories present in the diet. As first shown in the 1930s, mice that are maintained on very strict diets typically live 30 to 40 percent longer than their littermates who are fed diets of normal caloric content. Two separate long-term studies are currently in progress on rhesus monkeys to see if they too live longer and healthier lives when maintained on calorie-restricted diets. Significant differences in the published data between these two groups have made it difficult to draw firm conclusions on the value of calorie restriction (CR) in primates. One team of researchers at the Wisconsin National Primate Research Center reported in 2009 that animals in their CR group had lower blood levels of glucose, insulin, and triglycerides and were less prone to age-related disorders such as diabetes and coronary artery disease. The effect of calorie restriction on the external appearance of one of these animals is seen in the photographs of Figure 1. The Wisconsin group also reported that 37 percent of the control group (i.e., animals that had enjoyed unrestricted diets) had died during the 20 years of the study compared to only 13 percent of the CR group.

In contrast, the other team of researchers at the National Institute on Aging (NIA) reported in late 2012 that calorie restriction did not improve survival outcome. In fact, individuals in their CR group did not exhibit the reduced cardiovascular disease that characterized the CR individuals in the Wisconsin study. One important difference between the CR and control groups in the NIA study was noted: none of the animals in the CR group had died from cancer as compared to five deaths from cancer among those in the control group. It has been suggested that differences in animal survival between the two studies may be explained by the fact that individuals of the control group in the Wisconsin study were allowed to eat unlimited amounts of high-sucrose-containing foods, which may have made them less healthy than the control animals in the NIA study. Neither of these studies has been conducted long enough to determine if the animals' maximum lifespan (more than 40 years) is increased as a result of CR.

As reported in numerous television news shows, a growing number of humans are hoping to extend their lifespan by practicing calorie restriction, which in essence means that they are willing to subject themselves to an extremely limited, but balanced, diet. The National Institute on Aging has also begun a study (named CALERIE) on human subjects who are overweight (but not obese) and who are kept on diets containing about 25 percent fewer calories than would be required to maintain their initial body weight. After a period of six months of calorie restriction, these individuals show remarkable metabolic changes; they have a lower body temperature, their blood insulin and LDL-cholesterol levels are lower, they have lost weight as would be expected, and their energy expenditure is reduced beyond that expected due simply to their lower body mass. In addition, the level of DNA damage experienced by the cells of these individuals is reduced, which suggests a decrease in production of reactive oxygen species (page 35).

A number of early studies demonstrated that the lifespan of a worm or fruit fly can also be dramatically increased by reducing the activity of insulin-like growth factors and their receptors. Studies of humans support this relationship; humans who live exceptionally long lives often exhibit unusually high insulin sensitivity; that is, their tissues respond fully to relatively low circulating insulin levels. Low insulin levels are also linked to reduced incidence of cancer. Thus, just as high insulin levels and increased insulin resistance are associated with poor health, low insulin levels and increased insulin sensitivity appear to be associated with good health. It is interesting to note that calorie restriction in laboratory animals leads to decreased insulin levels and increased insulin sensitivity, so that these two paths to increased longevity may be acting by the same mechanism.

Although the medical community has typically focused on insulin as the primary metabolic hormone, many basic researchers in the field of aging have focused on the related hormone insulin-like growth factor 1 (IGF-1). In one study, it was found that mutations in the gene encoding the IGF-1 receptor are especially frequent in a group of centenarians (humans living over the age of 100). Insulin and IGF-1 share the downstream effector mTOR. mTOR became a central focus of the field of mammalian aging when it was discovered that rapamycin, an inhibitor of the mTOR kinase, significantly extended the lifespan of mice and decreased the incidence of age-related disorders. This is the first compound that has been shown to increase lifespan in mammals, and it also has this capability when given to yeast, worms, and flies. Unfortunately, rapamycin is also a potent suppressor of the immune system, so it is not itself considered to be a viable anti-aging drug. Calorie restriction has also been shown to reduce signaling through the mTOR pathway. These findings support the notion that mTOR plays an important role in the aging process.

mTOR is a nutrient sensor and a primary regulator of cellular metabolism. mTOR is a protein kinase that exists as a component of two distinct complexes, mTORC1 and mTORC2. mTORC1, which is especially sensitive to rapamycin, is activated by the availability of nutrients, especially amino acids, and can stimulate lipid and protein synthesis, inhibit autophagy, and promote cell growth and proliferation. Studies suggest that reduced nutrient availability, as occurs during calorie restriction, reduces mTORC1 signaling. A number of proteins both upstream and downstream of mTOR have been implicated in regulating lifespan, including S6K1 (which phosphorylates numerous proteins involved in protein synthesis, thereby enhancing mRNA translation), the protein deacetylase Sir2 (which removes acetyl groups from histones and nonhistone proteins), Atg proteins (which regulate autophagy), and the transcription factor FOXO (which activates expression of genes whose encoded proteins include molecular chaperones and proteins that play a role in defense against oxidative stress). Untangling the roles of these various components has proven to be very difficult, and there is considerable debate as to how mTOR inhibition increases lifespan.

Figure 1 The effects of calorie restriction on rhesus macaques. Photographs of (a) a typical control animal at 27.6 years of age (about the average lifespan) and (b) an age-matched animal on a calorie-restricted diet. (Contrasting results from the NIA study can be found in an advanced online publication in the 8/30/2012 issue of Nature.) (FROM R. J. COLMAN, ET AL., SCIENCE 325:201, 2009. REPRINTED WITH PERMISSION FROM AAAS. COURTESY RICKI COLMAN, WISCONSIN NATIONAL PRIMATE RESEARCH CENTER, UNIVERSITY OF WISCONSIN.)

this affected by the activity of a Ras-GAP? How does Ras differ from a heterotrimeric G protein?

3. What is an SH2 domain, and what role does it play in signaling pathways?

4. How does the MAP kinase cascade alter the transcriptional activity of a cell?

5. What is the relationship between type 2 diabetes and insulin production? How is it that a drug that increases insulin sensitivity might help treat this disease?

Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells

Signaling Pathways in Plants Plants and animals share certain basic signaling mechanisms, including the use of Ca2⫹ and phosphoinositide messengers, but other pathways are unique to each major kingdom. For example, cyclic nucleotides, which may be the most ubiquitous animal cell messengers, appear to play little, if any, role in plant cell signaling. Receptor tyrosine kinases are also lacking in plant cells. On the other hand, plants contain a type of protein kinase that is absent from animal cells. It has long been known that bacterial cells have a protein kinase that phosphorylates histidine residues and mediates the cell’s response to a variety of environmental signals. Until 1993, these enzymes were thought to be restricted to bacterial cells but were then discovered in both yeast and flowering plants. In both types of eukaryotes, the enzymes are transmembrane proteins with an extracellular domain that acts as a receptor for external stimuli and a cytoplasmic, histidine kinase domain that transmits the signal to the cytoplasm. One of the best studied of these plant proteins is encoded by the Etr1 gene. The product of the Etr1 gene encodes a receptor for the gas ethylene (C2H4), a plant hormone that regulates a diverse array of developmental processes, including seed germination, flowering, and fruit ripening. Binding of ethylene to its receptor leads to transmission of signals along a pathway that is very similar to the MAP kinase cascade found in yeast and animal cells. As in other eukaryotes, the downstream targets of the MAP kinase pathway in plants are transcription factors that activate expression of specific genes encoding proteins required for the hormone response. As researchers analyze the massive amount of data obtained from sequencing Arabidopsis and other plant genomes, the similarities and differences between plant and animal signaling pathways should become more apparent.

REVIEW 1. Describe the steps between the binding of an insulin molecule at the surface of a target cell and the activation of the effector PI3K. How does the action of insulin differ from other ligands that act by means of receptor tyrosine kinases? 2. What is the role of Ras in signaling pathways? How is

15.5 | The Role of Calcium as an Intracellular Messenger
Calcium ions play a significant role in a remarkable variety of cellular activities, including muscle contraction, immune responses, cell division, secretion, fertilization, synaptic transmission, metabolism, transcription, cell movement, and cell death. In each of these cases, an extracellular message is received at the cell surface and leads to a dramatic increase in the concentration of calcium ions within the cytosol. The concentration of calcium ions in a particular cellular compartment is controlled by the regulated activity of Ca2+ pumps, Ca2+ exchangers, and/or Ca2+ ion channels located within the membranes that surround the compartment (as in Figure 15.28). The concentration of Ca2+ ions in the cytosol of a resting cell is maintained at very low levels, typically about 10^-7 M. In contrast, the concentration of this ion in the extracellular space or within the lumen of the ER or a plant cell vacuole is typically 10,000 times higher than in the cytosol. The cytosolic calcium level is kept very low because (1) Ca2+ ion channels in both the plasma and ER membranes are normally kept closed, making these membranes highly impermeable to this ion, and (2) energy-driven Ca2+ transport systems of the plasma and ER membranes pump calcium out of the cytosol.² Abnormal elevation of cytosolic Ca2+ concentration, as can occur in brain cells following a stroke, can lead to massive cell death.

² Mitochondria also play an important role in sequestering and releasing Ca2+ ions, but their role was considered in Chapter 5 and will not be discussed here.
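To give a feel for the size of the gradient described above, the following minimal sketch converts the two figures stated in the text (a resting cytosolic level of about 10^-7 M and a 10,000-fold higher level outside) into an equilibrium potential using the Nernst equation. The Nernst equation and the temperature of 37 °C are assumptions brought in for illustration; neither appears in the text, and the function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_potential(c_out, c_in, z=2, temp_k=310.15):
    """Equilibrium potential in volts for an ion of valence z.

    temp_k defaults to 37 degrees C, an assumed value.
    """
    return (R * temp_k) / (z * F) * math.log(c_out / c_in)


cytosol = 1e-7               # resting cytosolic [Ca2+] in M, from the text
outside = cytosol * 10_000   # "10,000 times higher", from the text
e_ca = nernst_potential(outside, cytosol)

print(f"extracellular [Ca2+] is about {outside:.0e} M")
print(f"Ca2+ equilibrium potential is about {e_ca * 1000:.0f} mV")
```

Under these assumptions the equilibrium potential comes out on the order of +120 mV, far from the negative resting potential of a typical cell, which is one way to see why Ca2+ rushes into the cytosol as soon as a channel opens.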

IP3 and Voltage-Gated Ca2+ Channels
We have described in previous pages two major types of signaling receptors, GPCRs and RTKs. It was noted on page 630 that interaction of an extracellular messenger molecule with a GPCR can lead to the activation of the enzyme phospholipase C-β, which splits the phosphoinositide PIP2 to release the molecule IP3, which in turn opens calcium channels in the ER membrane, leading to a rise in cytosolic [Ca2+]. Extracellular messengers that signal through RTKs can trigger a similar response. The primary difference is that RTKs activate members of the phospholipase C-γ subfamily, which possess an SH2 domain that allows them to bind to the activated, phosphorylated RTK. There are numerous other PLC isoforms. For example, PLCδ is activated by Ca2+ ions, and PLCε is activated by Ras-GTP. All PLC isoforms carry out the same reaction, producing IP3 and linking a multitude of cell surface receptors to an increase in cytoplasmic Ca2+. There is another major route leading to elevation of cytosolic [Ca2+], which was encountered in our discussion of synaptic transmission on page 169. In this case, a nerve impulse leads to a depolarization of the plasma membrane, which triggers the opening of voltage-gated calcium channels in the plasma membrane, allowing the influx of Ca2+ ions from the extracellular medium.

Figure 15.27 Experimental demonstration of localized release of intracellular Ca2+ within a single dendrite of a neuron. The mechanism of IP3-mediated release of Ca2+ from intracellular stores was described on page 630. In the micrograph shown here, which pictures an enormously complex Purkinje cell (neuron) of the cerebellum, calcium ions have been released locally within a small portion of the complex “dendritic tree.” Calcium release from the ER (shown in red) was induced in the dendrite following the local production of IP3, which followed repetitive activation of a nearby synapse. The sites of release of cytosolic Ca2+ ions are revealed by fluorescence from a fluorescent calcium indicator that was loaded into the cell prior to stimulation of the cell. (FROM ELIZABETH A. FINCH AND GEORGE J. AUGUSTINE, NATURE, VOL. 396, COVER OF 12/24/98. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

The IP3 receptors described earlier are one of two main types of Ca2+ ion channels present in the ER membrane; the other type is called ryanodine receptors (RyRs) because they bind the toxic plant alkaloid ryanodine. Ryanodine receptors are found primarily in excitable cells and are best studied in cardiac and skeletal muscle cells, where they mediate the rise in Ca2+ levels following the arrival of an action potential. Mutations in the cardiac RyR isoform have been linked to occurrences of sudden death during periods of exercise. Depending on the type of cell in which they are found, RyRs can be opened by a variety of agents, including calcium itself. The influx of a limited amount of calcium through open channels in the plasma membrane induces the opening of ryanodine


Visualizing Cytoplasmic Ca2+ Concentration in Real Time
Our understanding of the role of Ca2+ ions in cellular responses has been greatly advanced by the development of indicator molecules that emit light in the presence of free calcium. In the mid-1980s, new types of highly sensitive, fluorescent, calcium-binding compounds (e.g., fura-2) were developed in the laboratory of Roger Tsien at the University of California, San Diego. These compounds are synthesized in a form that can enter a cell by diffusing across its plasma membrane. Once inside a cell, the compound is modified to a form that is unable to leave the cell. Using these probes, the concentration of free calcium ions in different parts of a living cell can be determined over time by monitoring the light emitted using a fluorescence microscope and computerized imaging techniques. Use of calcium-sensitive, light-emitting molecules has provided dramatic portraits of the complex spatial and temporal changes in free cytosolic calcium concentration that occur in a single cell in response to various types of stimuli. This is one of the advantages of studying calcium-mediated responses compared to responses mediated by other types of messengers whose location in a cell cannot be readily visualized. Depending on the type of responding cell, a particular stimulus may induce repetitive oscillations in the concentration of free calcium ions, as seen in Figure 15.11; cause a wave of Ca2+ release that spreads from one end of the cell to the other (see Figure 15.29); or trigger a localized and transient release of Ca2+ in one part of the cell. Figure 15.27 shows a Purkinje cell, a type of neuron in the mammalian cerebellum that maintains synaptic contact with thousands of other cells through an elaborate network of postsynaptic dendrites. The micrograph in Figure 15.27 shows the release of free calcium in a localized region of the “dendritic tree” of the cell following synaptic activation. The burst of calcium release remains restricted to this region of the cell.
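The text does not say how a fluorescence signal is turned into a concentration. One widely used approach for ratiometric indicators such as fura-2 is the calibration relation [Ca2+] = Kd · β · (R − Rmin) / (Rmax − R), where R is the ratio of fluorescence at two excitation wavelengths. The sketch below applies that relation; every numerical constant in it is a hypothetical calibration value chosen for illustration, not a measurement from the text, and the function name is likewise an assumption.

```python
def calcium_from_ratio(r, r_min, r_max, kd, beta):
    """Estimate free [Ca2+] (in M) from a ratiometric fluorescence reading.

    r      measured fluorescence ratio
    r_min  ratio with no Ca2+ bound to the dye
    r_max  ratio with the dye saturated by Ca2+
    kd     dissociation constant of the dye for Ca2+, in M
    beta   scaling factor from the calibration (free/bound signal ratio)
    """
    if not r_min < r < r_max:
        raise ValueError("ratio must lie strictly between r_min and r_max")
    return kd * beta * (r - r_min) / (r_max - r)


# Hypothetical calibration constants, for illustration only.
KD = 225e-9
R_MIN, R_MAX, BETA = 0.3, 8.0, 10.0

resting = calcium_from_ratio(0.65, R_MIN, R_MAX, KD, BETA)
stimulated = calcium_from_ratio(2.5, R_MIN, R_MAX, KD, BETA)

print(f"resting:    {resting * 1e9:.0f} nM")
print(f"stimulated: {stimulated * 1e9:.0f} nM")
```

Applying the same function to every pixel of a ratio image, frame by frame, is what would produce the kind of spatial and temporal calcium maps described above.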

[Figure 15.28 labels: plasma membrane with voltage-gated Ca2+ channel and Na+/Ca2+ exchanger (3 Na+ in, 1 Ca2+ out); SER membrane with ryanodine receptor (RyR) and Ca2+ pump of the SER (SERCA); high [Ca2+] outside the cell and in the SER lumen, low [Ca2+] in the cytosol; steps 1–5.]

Figure 15.28 Calcium-induced calcium release, as it occurs in a cardiac muscle cell. A depolarization in membrane voltage causes the opening of voltage-gated calcium channels in the plasma membrane, allowing entry of a small amount of Ca2+ into the cytosol (step 1). The calcium ions bind to ryanodine receptors in the SER membrane (step 2), leading to release of stored Ca2+ into the cytosol (step 3), which triggers the cell’s contraction. The calcium ions are subsequently removed from the cytosol by the action of Ca2+ pumps located in the membrane of the SER (step 4) and a Na+/Ca2+ secondary transport system in the plasma membrane (step 5), which leads to relaxation. This cycle is repeated after each heart beat. (REPRINTED WITH PERMISSION AFTER M. J. BERRIDGE, NATURE 361:317, 1993; COPYRIGHT 1993. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Figure 15.29 Calcium wave in a starfish egg induced by a fertilizing sperm. The unfertilized egg was injected with a calcium-sensitive fluorescent dye, fertilized, and photographed at 10-second intervals. The rise in Ca2+ concentration is seen to spread from the point of sperm entry (arrow) throughout the entire egg. The blue color indicates low free [Ca2+], whereas the red color indicates high free [Ca2+]. A similar Ca2+ wave in mammalian eggs is triggered by the formation of IP3 by a phospholipase C that is brought into the egg by the fertilizing sperm. (COURTESY OF STEPHEN A. STRICKER.)

receptors in the ER, causing the release of Ca2+ into the cytosol (Figure 15.28). This phenomenon is called calcium-induced calcium release (CICR). Extracellular signals that are transmitted by Ca2+ ions typically act by opening a small number of Ca2+ ion channels at the cell surface at the site of the stimulus. As Ca2+ ions rush through these channels and enter the cytosol, they act on nearby Ca2+ ion channels in the ER, causing these channels to open and release additional calcium into adjacent regions of the cytosol. In some responses, the elevation of Ca2+ levels remains localized to a small region of the cytosol (as in Figure 15.27). In other cases, a propagated wave of calcium release spreads through the entire cytoplasmic compartment. One of the most dramatic Ca2+ waves occurs within the first minute or so following fertilization and is induced by the sperm’s contact with the plasma membrane of the egg (Figure 15.29). The sudden rise in cytoplasmic calcium concentration following fertilization triggers a number of events, including activation of cyclin-dependent kinases (page 575) that drive the zygote toward its first mitotic division. Calcium waves are transient because the ions are rapidly pumped out of the cytosol and back into the ER and/or the extracellular space. Recent research in the field of calcium signaling has focused on a phenomenon known as store-operated calcium entry (or SOCE), in which the “store” refers to the calcium ions stored in the ER. During periods of repeated cellular responses, the stockpile of intracellular calcium ions can become depleted. During SOCE, the depletion of calcium levels in the ER triggers a response leading to the opening of calcium channels in the plasma membrane, as depicted in Figure 15.30. Once these channels have opened, Ca2+ ions can enter the cytosol, from where they can be pumped back into the ER, thereby replenishing the ER’s calcium stores.
The mechanism responsible for SOCE had been an unsolved mystery for many years until it was discovered recently that these events are orchestrated by a signaling system operating between the ER and plasma membrane. In this system, the depletion of Ca2+ in the ER leads to the clustering within the ER membrane of a Ca2+-sensing protein called STIM1 into regions where the ER and plasma membranes come into close proximity (25–50 nm) to one another. Following their rearrangement in the ER membrane, the STIM1 clusters act to recruit

subunits of a plasma membrane protein called Orai1 into adjacent regions of the plasma membrane (Figure 15.30). Orai1 is a tetrameric Ca2+ ion channel that had been identified as being involved in a particular type of inherited human immune deficiency that results from a lack of Ca2+ stores in T lymphocytes. Contact between the cytosolic surfaces of the STIM1 and Orai1 proteins in these ER-plasma membrane junctions leads to the opening of the Orai1 channels, the influx of Ca2+ into microdomains of the cytosol near the STIM1 clusters, and the refilling of the cell’s ER stores.

[Figure 15.30 labels: plasma membrane; Orai1 (closed); Orai1 (open); cytosol; STIM1; Ca2+; ER membrane; depletion of Ca2+ stores.]

Figure 15.30 A model for store-operated calcium entry. When the ER lumen contains abundant Ca2+ ions, the STIM1 proteins of the ER membrane and the Orai1 proteins of the plasma membrane are situated diffusely in their respective membranes, and the Orai1 calcium channel is closed. If the ER stores are depleted, a signaling system operates between the two membranes, causing the two proteins to become clustered within their respective membranes in close proximity to one another. Apparent interaction between the two membrane proteins leads to opening of the Orai1 channel and the influx of Ca2+ ions into the cytosol, from where they can be pumped into the ER lumen.

Table 15.4 Examples of Mammalian Proteins Activated by Ca2+

Protein                 Protein function
Troponin C              Modulator of muscle contraction
Calmodulin              Ubiquitous modulator of protein kinases and other enzymes (MLCK, CaM kinase II, adenylyl cyclase I)
Calretinin, retinin     Activator of guanylyl cyclase
Calcineurin B           Phosphatase
Calpain                 Protease
PI-specific PLC         Generator of IP3 and diacylglycerol
α-Actinin               Actin-bundling protein
Annexin                 Implicated in endo- and exocytosis, inhibition of PLA2
Phospholipase A2        Producer of arachidonic acid
Protein kinase C        Ubiquitous protein kinase
Gelsolin                Actin-severing protein
IP3 receptor            Effector of intracellular Ca2+ release
Ryanodine receptor      Effector of intracellular Ca2+ release
Na+/Ca2+ exchanger      Effector of the exchange of Ca2+ for Na+ across the plasma membrane
Ca2+-ATPase             Pumps Ca2+ across membranes
Ca2+ antiporters        Exchanger of Ca2+ for monovalent ions
Caldesmon               Regulator of muscle contraction
Villin                  Actin organizer
Arrestin                Terminator of photoreceptor response
Calsequestrin           Ca2+ buffer

Adapted from D. E. Clapham, Cell 80:260, 1995, by copyright permission of Cell Press. Cell by Cell Press. Reproduced with permission of Cell Press in the format reuse in a book/textbook via Copyright Clearance Center.

Figure 15.31 Calmodulin. A ribbon diagram of calmodulin (CaM) with four bound calcium ions (white spheres). Binding of these Ca2+ ions changes the conformation of calmodulin, exposing a hydrophobic surface that promotes interaction of Ca2+–CaM with a large number of target proteins. (COURTESY MICHAEL CARSON, UNIVERSITY OF ALABAMA AT BIRMINGHAM.)

[Figure 15.32a labels: stomatal pore; guard cells; H2O, K+, and Ca2+ movements; closure of stomatal pore.]

Ca2+-Binding Proteins
Unlike cAMP, whose action is usually mediated by stimulation of a protein kinase, calcium can affect a number of different types of cellular effectors, including protein kinases (Table 15.4). Depending on the cell type, calcium ions can activate or inhibit various enzyme and transport systems, change the ionic permeability of membranes, induce membrane fusion, or alter cytoskeletal structure and function. Calcium does not bring about these responses by itself but acts in conjunction with a number of calcium-binding proteins (examples are discussed on pages 303 and 370). The best-studied calcium-binding protein is calmodulin, which participates in many signaling pathways. Calmodulin is found universally in plants, animals, and eukaryotic microorganisms, and it has virtually the same amino acid sequence from one end of the eukaryotic spectrum to the other. Each molecule of calmodulin (Figure 15.31) contains four binding sites for calcium. Calmodulin does not have sufficient affinity for Ca2+ to bind the ion in a nonstimulated cell. If, however, the Ca2+ concentration rises in response to a stimulus, the ions bind to calmodulin, changing the conformation of the protein and increasing its affinity for a variety of effectors. Depending on the cell type, the calcium–calmodulin (Ca2+–CaM) complex may bind to a protein kinase, a cyclic nucleotide phosphodiesterase, ion channels, or even to the calcium-transport system of the plasma membrane. In the latter instance, rising levels of calcium activate the system responsible for ridding the cell of excess quantities of the ion, thus constituting a self-regulatory mechanism for maintaining low intracellular calcium concentrations. The Ca2+–CaM complex can also stimulate gene transcription through activation of various protein kinases (CaMKs) that phosphorylate transcription factors. In the best-studied case, one of these protein kinases phosphorylates CREB on the same serine residue as PKA (Figure 15.14).
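The switch-like behavior described above, in which calmodulin is essentially free of Ca2+ in a resting cell but becomes loaded when the concentration rises, can be illustrated with a simple Hill function. This is only a sketch: the half-saturation concentration of 1 µM, the Hill coefficient of 2, and the stimulated level of 10 µM are assumed round numbers, not values given in the text, and real calmodulin binding at four sites is more complicated.

```python
def fraction_bound(ca_molar, k_half=1e-6, hill_n=2.0):
    """Fractional Ca2+ occupancy of a binding protein (Hill equation).

    k_half and hill_n are illustrative assumptions, not measured values.
    """
    return ca_molar ** hill_n / (k_half ** hill_n + ca_molar ** hill_n)


resting = fraction_bound(1e-7)      # resting cytosolic [Ca2+] from the text
stimulated = fraction_bound(1e-5)   # assumed level after a stimulus

print(f"resting occupancy:    {resting:.3f}")
print(f"stimulated occupancy: {stimulated:.3f}")
```

With these assumed parameters a 100-fold rise in Ca2+ moves the occupancy from about 1% to about 99%, which is the kind of all-or-none response that makes calcium useful as a messenger.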

[Figure 15.32b labels: plasma membrane with K+ influx channel (closed), K+ efflux channel (open), and anion efflux channel (open; Cl-, NO3-); Ca2+; tonoplast; vacuole; steps 1, 2, 3a, 3b, 4.]

Figure 15.32 A simplified model of the role of Ca2+ in guard cell closure. (a) Photograph of stomatal pores, each flanked by a pair of guard cells. The stomata are kept open as turgor pressure is kept high within the guard cells, causing them to bulge outward as seen here. (b) One of the factors controlling stomatal pore size is the hormone abscisic acid (ABA). When ABA levels rise, calcium ion channels in the plasma membrane are opened, allowing the influx of Ca2+ (step 1), which triggers the release of Ca2+ from internal stores (step 2). The subsequent elevation of intracellular [Ca2+] closes K+ influx channels (step 3a) and opens K+ and anion efflux channels (step 3b). These ion movements lead to a drop in internal solute concentration and the osmotic loss of water (step 4). (Phosphorylation by protein kinases also plays a role in these events.) (A: DR. JEREMY BURGESS/PHOTO RESEARCHERS, INC.)

Regulating Calcium Concentrations in Plant Cells
Calcium ions (acting in conjunction with calmodulin) are important intracellular messengers in plant cells. The levels of cytosolic calcium change dramatically within certain plant cells in response to a variety of stimuli, including changes in light, pressure, gravity, and the concentration of plant hormones such as abscisic acid. The concentration of Ca2+ in the cytosol of a resting plant cell is kept very low by the action of transport proteins situated in the plasma membrane and vacuolar membrane (tonoplast). The role of Ca2+ in plant cell signaling is illustrated by guard cells that regulate the diameter of the microscopic pores (stomata) of a leaf (Figure 15.32a). Stomata are a major site of water loss in plants, and the diameter of their aperture is tightly controlled, which prevents dehydration. The diameter of the stomatal pore decreases as the fluid (turgor) pressure in the guard cell decreases. The drop in turgor pressure is caused in turn by a decrease in the ionic concentration (osmolarity) of the guard cell. Adverse conditions, such as high temperatures and low humidity, stimulate the release of the plant stress hormone abscisic acid. Studies suggest that abscisic acid binds to a GPCR in the plasma membrane of guard cells, triggering the opening of Ca2+ ion channels in the same membrane (Figure 15.32b). The resulting influx of Ca2+ into the cytosol triggers the release of additional Ca2+ from intracellular stores. The elevated cytosolic Ca2+ concentration leads to closure of K+ influx channels in the plasma membrane and opening of both K+ and anion efflux channels. These changes produce a net outflow of K+ ions and anions (NO3- and Cl-) and a resulting decrease in turgor pressure.
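The link between solute loss and pressure can be made concrete with the van 't Hoff relation for an ideal dilute solution, Π = cRT. The sketch below is a rough illustration only: the relation itself, the starting osmolarity of 0.5 osmol/L, the loss of 0.1 osmol/L, and the temperature of 25 °C are all assumptions that do not come from the text, and a real guard cell is not an ideal solution.

```python
R = 8.314462618  # gas constant, J/(mol*K)


def osmotic_pressure_pa(osmolarity_molar, temp_k=298.15):
    """Ideal (van 't Hoff) osmotic pressure in pascals.

    osmolarity_molar is the total solute particle concentration in mol/L.
    """
    moles_per_m3 = osmolarity_molar * 1000.0  # convert mol/L to mol/m^3
    return moles_per_m3 * R * temp_k


before = osmotic_pressure_pa(0.5)   # hypothetical guard cell before ABA
after = osmotic_pressure_pa(0.4)    # hypothetical value after K+ and anion efflux
drop_mpa = (before - after) / 1e6

print(f"osmotic pressure falls by about {drop_mpa:.2f} MPa")
```

Under these assumptions, losing 0.1 osmol/L of solute lowers the osmotic pressure by roughly a quarter of a megapascal, and, other things being equal, the turgor pressure that holds the stomatal pore open falls with it.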

REVIEW 1. How is the [Ca2+] of the cytosol maintained at such a low level? How does the concentration change in response to stimuli? 2. What is the role of calcium-binding proteins such as calmodulin in eliciting a response? 3. Describe the role of calcium in mediating the diameter of stomata in guard cells.

15.6 | Convergence, Divergence, and Cross-Talk Among Different Signaling Pathways

The signaling pathways described above and illustrated schematically in the various figures depict linear pathways leading directly from a receptor at the cell surface to an end target. In actual fact, signaling pathways in the cell are much more complex. For example:

■ Signals from a variety of unrelated receptors, each binding to its own ligand, can converge to activate a common effector, such as Ras or Raf.
■ Signals from the same ligand, such as EGF or insulin, can diverge to activate a variety of different effectors and pathways, leading to diverse cellular responses.
■ Signals can be passed back and forth between different pathways, a phenomenon known as cross-talk.

These characteristics of cell-signaling pathways are illustrated schematically in Figure 15.33. Signaling pathways provide a mechanism for routing information through a cell, not unlike the way the central nervous system routes information to and from the various organs of the body. Just as the central nervous system collects information about the environment from various sense organs, the cell receives information about its environment through the activation of various surface receptors, which act like sensors to detect extracellular stimuli. Like sense organs that are sensitive to specific forms of stimuli (e.g., light, pressure, or sound waves), cell-surface receptors can bind only to specific ligands and are unaffected by the presence of a large variety of unrelated molecules. A single cell may have dozens of different receptors sending signals to the cell interior simultaneously. Once they have been transmitted into the cell, signals from these receptors can be selectively routed along a number of different signaling pathways that may cause a cell to divide, change shape, activate a particular metabolic pathway, or even commit suicide (discussed in a following section). In this way, the cell integrates information arriving from different sources and mounts an appropriate and comprehensive response.

[Figure 15.33 labels: G protein-coupled receptors (ligands: acetylcholine, histamine, NA, 5-HT, ATP, PAF, TXA2, glutamate, angiotensin II, vasopressin, bradykinin, substance P, bombesin, neuropeptide Y, thrombin, cholecystokinin, endothelin, neuromedin, TRH, GnRH, PTH, odorants, light); R; G; PLCβ; PLCγ; PIP2; IP3; IP3R; DAG; PKC; Ca2+; tyrosine kinase-linked receptors (PDGF, EGF, etc.); PI3K; PIP3; GAP; Ras; Raf; MAP kinase; cellular activity and mitogenesis.]

Figure 15.33 Examples of convergence, divergence, and cross-talk among various signal-transduction pathways. This drawing shows the outlines of signal-transduction pathways initiated by receptors that act by means of both heterotrimeric G proteins and receptor protein-tyrosine kinases. The two are seen to converge by the activation of different phospholipase C isoforms, both of which lead to the production of the same second messengers (IP3 and DAG). Activation of the RTK by either PDGF or EGF leads to the transmission of signals along three different pathways, an example of divergence. Cross-talk between the two types of pathways is illustrated by calcium ions, which are released from the SER by action of IP3 and can then act on various proteins, including protein kinase C (PKC), whose activity is also stimulated by DAG. (M. J. BERRIDGE, REPRINTED WITH PERMISSION FROM NATURE VOL. 361, P. 315, 1993, COPYRIGHT 1993. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Examples of Convergence, Divergence, and Cross-Talk Among Signaling Pathways

1. Convergence. We have discussed two distinct types of cell-surface receptors in this chapter: G protein-coupled receptors and receptor tyrosine kinases. Another type of cell-surface receptor that is capable of signal transduction was discussed in Chapter 7, namely, integrins. Although these three types of receptors may bind to very different ligands, all of them can lead to the formation of phosphotyrosine docking sites for the SH2 domain of the adaptor protein Grb2 in close proximity to the plasma membrane (Figure 15.34). The recruitment of the Grb2-Sos complex results in the activation of Ras and transmission of signals down the MAP kinase pathway. As a result of this convergence, signals from diverse receptors can lead to the transcription and translation of a similar set of growth-promoting genes in each target cell.
2. Divergence. Signal divergence has been evident in virtually all of the examples of signal transduction that have been described in this chapter. A quick look at Figure 15.15 or 15.25b, c illustrates how a single stimulus (a ligand binding to a GPCR or an insulin receptor) sends signals out along a variety of different pathways.
3. Cross-talk. In previous sections, we examined a number of signaling pathways as if each operated as an independent, linear chain of events. In fact, the information circuits that operate in cells are more likely to resemble an interconnected web in which components produced in one pathway can participate in events occurring in other pathways. The more that is learned about information signaling in cells, the more cross-talk between signaling pathways is discovered. Rather than attempting to catalog the ways that information can be passed back and forth within a cell, we will look at an example involving cAMP that illustrates the importance of this type of cross-talk. Cyclic AMP was depicted earlier as an initiator of a reaction cascade leading to glucose mobilization. However, cAMP can also inhibit the growth of a variety of cells, including fibroblasts and fat cells, by blocking signals transmitted through the MAP kinase cascade. Cyclic AMP is thought to accomplish this by activating PKA, the cAMP-dependent kinase, which can phosphorylate and inhibit Raf, the protein that heads the MAP kinase cascade (Figure 15.35). These two pathways also intersect at another important signaling effector, the transcription factor CREB. CREB was described on page 632 as a terminal effector of cAMP-mediated pathways. It was assumed for years that CREB could only be phosphorylated by the cAMP-dependent kinase, PKA. It has since become apparent that CREB is a substrate of a much wider range of kinases. For example, one of the kinases that phosphorylates CREB is Rsk-2, which is activated as a result of phosphorylation by MAPK (Figure 15.35). In fact, both PKA and Rsk-2 phosphorylate CREB on precisely the same amino acid residue, Ser133, which should endow the transcription factor with the same potential in both pathways.

[Figure 15.34 labels: extracellular matrix; neurotransmitters, hormones, growth factors; growth factors (e.g., EGF, PDGF); integrin; G protein-coupled receptor; receptor tyrosine kinase; Grb2; Sos; Ras; MAP kinase cascade.]

Figure 15.34 Signals transmitted from a G protein-coupled receptor, an integrin, and a receptor tyrosine kinase all converge on Ras and are then transmitted along the MAP kinase cascade.

[Figure 15.35 labels: EGF and growth factor receptor; epinephrine and β-adrenergic receptor; SH2; Grb2; Sos; Ras-GTP; cAMP; PKA; Raf; MAPKK; MAPK; Rsk-2 kinase; CREB transcription factor; gene activity; + and − signs marking activation and inhibition.]

Figure 15.35 An example of cross-talk between two major signaling pathways. Cyclic AMP acts in some cells, by means of the cAMP-dependent kinase PKA, to block the transmission of signals from Ras to Raf, which inhibits the activation of the MAP kinase cascade. In addition, both PKA and the kinases of the MAP kinase cascade phosphorylate the transcription factor CREB on the same serine residue, activating the transcription factor and allowing it to bind to specific sites on the DNA.

A major unanswered question is raised by these examples of convergence, divergence, and cross-talk: How are different stimuli able to evoke distinct responses, even though they utilize similar pathways? PI3K, for example, is an enzyme that is activated by a remarkable variety of stimuli, including cell adhesion to the ECM, insulin, and EGF. How is it that activation of PI3K in an insulin-stimulated liver cell promotes GLUT4 translocation and protein synthesis, whereas activation of PI3K in an adherent epithelial cell promotes cell survival? Ultimately, these contrasting cellular responses must be due to differences in the protein composition of different cell types. Part of the answer probably lies in the fact that different cells have different isoforms of these various proteins, including PI3K. Some of these isoforms are encoded by different but related genes, whereas others are generated by alternative splicing (page 534) or other mechanisms. Different isoforms of PI3K, PKB, or PLC, for example, may bind to different sets of upstream and downstream components, which could allow similar pathways to evoke distinct responses. The variation in responses elicited by different cells possessing similar signaling proteins may also be partly explained by the presence of different protein scaffolds in each of the cell types. As shown in Figure 15.23, the specificity of a response can be orchestrated by the scaffolds with which the signaling proteins can interact. But it isn’t likely that variations in isoforms and scaffolds can fully explain the extraordinary diversity of cellular responses any more than differences in the structures of neurons can explain the range of responses evoked by the nervous system. Hopefully, as the signaling pathways of more and more cells are described, we will gain a better understanding of the specificity that can be achieved through the use of similar signaling molecules.
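The three ideas discussed in this section can be made concrete by treating the pathways as a small directed graph. The sketch below encodes a heavily simplified subset of the connections named in Figures 15.33 to 15.35; the edge list is an illustrative assumption (intermediate steps such as G proteins, adenylyl cyclase, Grb2-Sos, and MAPKK are collapsed into single arrows), and the function and variable names are hypothetical. A node with two or more activating inputs marks convergence, a node with two or more outputs marks divergence, and an inhibitory edge between otherwise separate pathways is one form of cross-talk.

```python
from collections import defaultdict

# (source, target, sign): +1 means "activates", -1 means "inhibits".
# Simplified, assumed edge list based on Figures 15.33-15.35.
EDGES = [
    ("GPCR", "PLC-beta", +1), ("RTK", "PLC-gamma", +1),
    ("PLC-beta", "IP3", +1), ("PLC-gamma", "IP3", +1),
    ("PLC-beta", "DAG", +1), ("PLC-gamma", "DAG", +1),
    ("IP3", "Ca2+", +1), ("Ca2+", "PKC", +1), ("DAG", "PKC", +1),
    ("RTK", "PI3K", +1),
    ("RTK", "Ras", +1), ("GPCR", "Ras", +1), ("Integrin", "Ras", +1),
    ("Ras", "Raf", +1), ("Raf", "MAPK", +1),
    ("MAPK", "Rsk-2", +1), ("Rsk-2", "CREB", +1),
    ("GPCR", "cAMP", +1), ("cAMP", "PKA", +1),
    ("PKA", "CREB", +1), ("PKA", "Raf", -1),
]


def build_maps(edges):
    """Return (activating inputs per node, outputs per node)."""
    activators = defaultdict(set)
    targets = defaultdict(set)
    for src, dst, sign in edges:
        targets[src].add(dst)
        if sign > 0:
            activators[dst].add(src)
    return activators, targets


activators, targets = build_maps(EDGES)

convergence = sorted(n for n, srcs in activators.items() if len(srcs) >= 2)
divergence = sorted(n for n, dsts in targets.items() if len(dsts) >= 2)
inhibitory_cross_talk = [(s, d) for s, d, sign in EDGES if sign < 0]

print("convergence points:", convergence)
print("divergence points: ", divergence)
print("inhibitory links:  ", inhibitory_cross_talk)
```

In this toy network Ras and CREB show up as convergence points, the two receptor classes show up as divergence points, and the PKA-to-Raf edge captures the cAMP-mediated block of the MAP kinase cascade described above.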

15.7 | The Role of NO as an Intercellular Messenger

During the 1980s, a new type of messenger was discovered that was neither an organic compound, such as cAMP, nor an ion, such as Ca2+; it was an inorganic gas: nitric oxide (NO). NO is unusual because it acts both as an extracellular messenger, mediating intercellular communication, and as a second messenger, acting within the cell in which it is generated. NO is formed from the amino acid L-arginine in a reaction that requires oxygen and NADPH and that is catalyzed by the enzyme nitric oxide synthase (NOS). Since its discovery, it has become evident that NO is involved in a myriad of biological processes including anticoagulation, neurotransmission, smooth muscle relaxation, and visual perception.

As with many other biological phenomena, the discovery that NO functions as a messenger molecule began with an accidental observation. It had been known for many years that acetylcholine acts in the body to relax the smooth muscle cells of blood vessels, but the response could not be duplicated in vitro. When portions of a major blood vessel such as the aorta were incubated in physiologic concentrations of acetylcholine in vitro, the preparation usually showed little or no response. In the late 1970s, Robert Furchgott, a pharmacologist at a New York State medical center, was studying the in vitro response of pieces of rabbit aorta to various agents. In his earlier studies, Furchgott used strips of aorta that had been dissected from the organ. For technical reasons, Furchgott switched from strips of aortic tissue to aortic rings and discovered that the new preparations responded to acetylcholine by undergoing relaxation. Further investigation revealed that the strips had failed to display the relaxation response because the delicate endothelial layer that lines the aorta had been rubbed away during the dissection. This surprising finding suggested that the endothelial cells were somehow involved in the response by the adjacent muscle cells. In subsequent studies, it was found that acetylcholine binds to receptors on the surface of endothelial cells, leading to the production and release of an agent that diffuses through the cell’s plasma membrane and causes the muscle cells to relax. The diffusible relaxing agent was identified in 1986 as nitric oxide by Louis Ignarro at UCLA and Salvador Moncada at the Wellcome Research Labs in England. The steps in the acetylcholine-induced relaxation response are illustrated in Figure 15.36.

[Figure 15.36 labels: lumen; endothelial cell; ACh; Ca2+; nitric oxide synthase; Arg; NO; smooth muscle cells; guanylyl cyclase; GTP; cGMP; relaxation; steps 1–6]

Figure 15.36 A signal transduction pathway that operates by means of NO and cyclic GMP that leads to the dilation of blood vessels. The steps illustrated in the figure are described in the text. (FROM R. G. KNOWLES AND S. MONCADA, TRENDS BIOCHEM SCIENCE 17:401, 1992. TRENDS IN BIOCHEMICAL SCIENCES BY INTERNATIONAL UNION OF BIOCHEMISTRY REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells

The binding of acetylcholine to the outer surface of an endothelial cell (step 1, Figure 15.36) signals a rise in cytosolic Ca2+ concentration (step 2) that activates nitric oxide synthase (step 3). The NO formed in the endothelial cell diffuses across the plasma membrane and into the adjacent smooth muscle cells (step 4), where it binds and stimulates guanylyl cyclase (step 5), the enzyme that synthesizes cyclic GMP (cGMP), which is an important second messenger similar in structure to cAMP. Cyclic GMP binds to a cGMP-dependent protein kinase (a PKG), which phosphorylates specific substrates causing relaxation of the muscle cell (step 6) and dilation of the blood vessel.

NO as an Activator of Guanylyl Cyclase

The discovery that NO acts as an activator of guanylyl cyclase was made in the late 1970s by Ferid Murad and colleagues at the University of Virginia. Murad was working with azide (N3−), a potent inhibitor of electron transport, and chanced to discover that the molecule stimulated cGMP production in cellular extracts. Murad and colleagues ultimately demonstrated that azide was being converted enzymatically into nitric oxide, which served as the actual guanylyl cyclase activator. These studies also explained the action of nitroglycerine, which had been used since the 1860s to treat the pain of angina that results from an inadequate flow of blood to the heart. Nitroglycerine is metabolized to nitric oxide, which stimulates the relaxation of the smooth muscles lining the blood vessels of the heart, increasing blood flow to the organ. The therapeutic benefits of nitroglycerine were discovered through an interesting observation. Persons with heart disease who worked with nitroglycerine in Alfred Nobel’s dynamite factory were found to suffer more from the pain of angina on days they weren’t at work. It is only fitting that the Nobel Prize, which is funded by a donation from Alfred Nobel’s estate, was awarded in 1998 for the discovery of NO as a signaling agent.
Inhibiting Phosphodiesterase

The discovery of NO as a second messenger has also led to the development of Viagra (sildenafil). During sexual arousal, nerve endings in the penis release NO, which causes relaxation of smooth muscle cells in the lining of penile blood vessels and engorgement of the organ with blood. As described above, NO mediates this response in smooth muscle cells by activation of the enzyme guanylyl cyclase and subsequent production of cGMP. Viagra (and related drugs) has no effect on the release of NO or the activation of guanylyl cyclase, but instead acts as an inhibitor of cGMP phosphodiesterase, the enzyme that destroys cGMP. Inhibition of this enzyme leads to maintained, elevated levels of cGMP, which promotes the development and maintenance of an erection. Viagra is quite specific for one particular isoform of cGMP phosphodiesterase, PDE5, which is the version that acts in the penis. Another isoform of the enzyme, PDE3, plays a key role in the regulation of heart muscle contraction, but fortunately is not inhibited by the drug. Viagra was discovered when a potential angina medication had unexpected side effects.

Recent investigations have revealed that NO has a variety of actions within the body that do not involve production of cGMP. For example, NO is added to the -SH group of certain cysteine residues in well over a hundred proteins, including hemoglobin, Ras, ryanodine channels, and caspases. This posttranslational modification, which is called S-nitrosylation, alters the activity, turnover, or interactions of the protein.
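The effect of phosphodiesterase inhibition on cGMP levels can be illustrated with a deliberately simple rate model. The sketch below assumes that cGMP is made at a constant rate by guanylyl cyclase and destroyed by phosphodiesterase in proportion to its concentration, so that d[cGMP]/dt = synthesis − k_deg·[cGMP]. This first-order form, the function names, and all numerical values are illustrative assumptions and are not taken from the text or from measured data.

```python
def cgmp_steady_state(synthesis_rate, k_deg):
    """Steady-state cGMP level for d[cGMP]/dt = synthesis_rate - k_deg * [cGMP].

    Units are arbitrary; both parameters are hypothetical.
    """
    if k_deg <= 0:
        raise ValueError("k_deg must be positive")
    return synthesis_rate / k_deg


def simulate_cgmp(synthesis_rate, k_deg, duration, dt=0.001, initial=0.0):
    """Integrate the same rate equation with the forward Euler method."""
    level = initial
    for _ in range(int(round(duration / dt))):
        level += (synthesis_rate - k_deg * level) * dt
    return level


# Same guanylyl cyclase activity in both cases (NO release is unchanged);
# only the phosphodiesterase rate constant is lowered to mimic an inhibitor.
baseline = cgmp_steady_state(synthesis_rate=1.0, k_deg=2.0)
inhibited = cgmp_steady_state(synthesis_rate=1.0, k_deg=0.5)
print(f"baseline: {baseline}, with phosphodiesterase inhibited: {inhibited}")
```

In this toy model the synthesis term is untouched, which mirrors the point made above that the drug does not affect NO release or guanylyl cyclase; the higher cGMP level comes entirely from slower degradation.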

REVIEW 1. Describe the steps in the signaling pathway by which nitric oxide mediates dilation of blood vessels.

15.8 | Apoptosis (Programmed Cell Death)

Apoptosis, or programmed cell death, is a normal process that is unique to animal cells. Apoptosis occurs through an orchestrated sequence of events that leads to the death of a cell. Death by apoptosis is a neat, orderly process (Figure 15.37) characterized by the overall shrinkage in volume of the cell and its nucleus, the loss of adhesion to neighboring cells, the formation of blebs at the cell surface, the dissection of the chromatin into small fragments, and the rapid engulfment of the “corpse” by phagocytosis. Apoptosis is often contrasted with a different type of cell death called necrosis, which generally follows some type of physical trauma or biochemical insult. Like apoptosis, necrosis is generally considered to be a regulated and programmed process, although much less orderly in nature. Necrosis is characterized by the swelling of both the cell and its internal membranous organelles, membrane breakdown, leakage of cell contents into the medium, and the resulting induction of inflammation. Because it is a safe and orderly process, apoptosis might be compared to the controlled implosion of a building using carefully placed explosives as compared to simply blowing up the structure without concern for what happens to the flying debris.

Why do our bodies have unwanted cells, and where do we find cells that become targeted for elimination? The short answer is: almost anywhere you look. During embryonic development, the earliest form of the human hand resembles a paddle without any space between the tissues that will become the fingers. The fingers are essentially carved out of the paddle via the elimination of the excess cells by apoptosis. Three stages of this process as it occurs in mice are shown in Figure 15.38. T lymphocytes are cells of the immune system that recognize and kill abnormal or pathogen-infected target cells. These target cells are recognized by specific receptors that are present on the surfaces of T lymphocytes. During embryonic development, T lymphocytes are produced that possess receptors capable of binding tightly to proteins present on the surfaces of normal cells within the body. T lymphocytes that have this dangerous capability are eliminated by apoptosis (see Figure 17.25). Apoptosis does not stop with the end of embryonic development. It has been estimated that 10^10–10^11 cells in the adult body die every day by apoptosis. For example, apoptosis is involved in the elimination of cells that have sustained irreparable genomic damage. This is important because damage to the genetic blueprint can result in unregulated cell division and the


development of cancer. Apoptosis is also responsible for the death of cells that are no longer required, such as activated T cells that have responded to an infectious agent that has been eliminated. Finally, apoptosis appears to be involved in neurodegenerative diseases such as Alzheimer’s disease, Parkinson’s disease, and Huntington’s disease. Elimination of essential neurons during disease progression gives rise to loss of memory or decrease in motor coordination. These examples show that apoptosis is important in shaping tissues and organs during embryonic development and in maintaining homeostasis within the bodies of adult animals. Serious diseases can result from both the failure to carry out apoptosis when the elimination of cells (e.g., cancer cells) is appropriate and from the overactive induction of apoptosis when the elimination of cells is not appropriate (e.g., type 1 diabetes).

The term apoptosis was coined in 1972 by John Kerr, Andrew Wyllie, and A. R. Currie of the University of Aberdeen, Scotland, in a landmark paper that described for the first time the coordinated events that occurred during the programmed death of a wide range of cells. Insight into the molecular basis of apoptosis was first revealed in studies on the nematode worm C. elegans, whose cells can be followed with absolute precision during embryonic development. Of the 1090 cells produced during the development of this worm, 131 cells are normally destined to die by apoptosis. In 1986, Robert Horvitz and his colleagues at the Massachusetts Institute of Technology discovered that worms carrying a mutation in the CED-3 gene proceed through development without losing any of their cells to apoptosis. This finding suggested that the product of the CED-3 gene played a crucial role in the process of apoptosis in this organism. Once a gene has been identified in one organism, such as a nematode, researchers can search for homologous genes in other organisms, such as humans or other mammals.
The identification of the CED-3 gene in nematodes led to the discovery of a homologous family of proteins in mammals, which are now called caspases. Caspases are a distinctive group of cysteine proteases (i.e., proteases with a key cysteine residue in their

Figure 15.37 A comparison of normal and apoptotic cells. (a,b) Scanning electron micrographs of a normal (a) and apoptotic (b) T-cell hybridoma. The apoptotic cell exhibits many surface blebs that are budded off in the cell. Bar equals 4 µm. (c) Transmission electron micrograph of an apoptotic cell treated with an inhibitor that arrests apoptosis at the membrane blebbing stage. (A,B: FROM S. J. MARTIN ET AL., TRENDS BIOCHEM. SCI. 19:28, 1994. REPRINTED WITH PERMISSION FROM ELSEVIER; C: COURTESY OF NICOLA J. MCCARTHY.)

Figure 15.38 Apoptosis carves out the structure of the mammalian digits. Three stages in this process in a mouse embryo. In this particular mouse, which is called a MacBlue mouse, all of the embryonic macrophages express the cyan fluorescent protein. Fluorescent macrophages have infiltrated the regions of the footpad where apoptosis has occurred and are clearing the space between the digits. (FROM DAVID A. HUME, NATURE IMMUNOL. 9:13, 2008; © 2008, REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)


catalytic site) that are activated at an early stage of apoptosis and are responsible for triggering most, if not all, of the changes observed during cell death. Caspases accomplish this feat by cleaving a select group of essential proteins. Among the targets of caspases are the following:

■ More than a dozen protein kinases, including focal adhesion kinase (FAK), PKB, PKC, and Raf1. Inactivation of FAK, for example, is presumed to disrupt cell adhesion, leading to detachment of the apoptotic cell from its neighbors. Inactivation of certain other kinases, such as PKB, serves to disrupt prosurvival signaling pathways. Caspases also disrupt the generation of survival signals by inactivating the NF-κB pathway (page 660).
■ Lamins, which make up the inner lining of the nuclear envelope. Cleavage of lamins leads to the disassembly of the nuclear lamina and shrinkage of the nucleus.
■ Proteins of the cytoskeleton, such as those of intermediate filaments, actin, tubulin, and gelsolin. Cleavage and consequent inactivation of these proteins lead to changes in cell shape.
■ An endonuclease called caspase activated DNase (CAD), which is activated following caspase cleavage of an inhibitory protein. Once activated, CAD translocates from the cytoplasm to the nucleus where it attacks DNA, severing it into fragments.

Recent studies have focused on the events that lead to the activation of a cell’s suicide program. Apoptosis can be triggered by both internal stimuli, such as abnormalities in the DNA, and external stimuli, such as certain cytokines (proteins secreted by cells of the immune system). For example, epithelial cells of the prostate become apoptotic when deprived of the male sex hormone testosterone. This is the reason why prostate cancer that has spread to other tissues is often treated with drugs that interfere with testosterone production. Studies indicate that external stimuli activate apoptosis by a signaling pathway, called the extrinsic pathway, that is distinct from that utilized by internal stimuli, which is called the intrinsic pathway. Here we will discuss the extrinsic and intrinsic pathways separately. However, it should be noted that there is cross-talk between these pathways and that extracellular apoptotic signals can cause activation of the intrinsic pathway.

The Extrinsic Pathway of Apoptosis

The steps in the extrinsic pathway are illustrated in Figure 15.39. In the case depicted in the figure, the stimulus for apoptosis is carried by an extracellular messenger protein called tumor necrosis factor (TNF), which was named for its ability to kill tumor cells. TNF is a trimeric protein produced by certain cells of the immune system in response to adverse conditions, such as exposure to ionizing radiation, elevated temperature, viral infection, or toxic chemical agents such as those used in cancer chemotherapy. Like other types of first messengers discussed in this chapter, TNF evokes its response by binding to a transmembrane receptor, TNFR1. The trimeric TNFR1 protein is a member of a family of related “death receptors” that turns on the apoptotic process. The cytoplasmic domain of each TNF receptor subunit contains a segment of about 70 amino acids called a “death domain” (each green segment in Figure 15.39) that mediates protein–protein interactions. Binding of TNF to the trimeric receptor produces a change in conformation of the receptor’s death domain, which leads to the recruitment of a number of proteins, as indicated in Figure 15.39. The last proteins to join the complex that assembles at the inner surface of the plasma membrane are two procaspase-8

[Figure 15.39 labels: TNF; TNFR1; plasma membrane; death domains; TRADD; FADD; procaspase-8; initiator caspase-8; executioner procaspases; executioner caspase]

Figure 15.39 A simplified model of the extrinsic (receptor-mediated) pathway of apoptosis. When TNF binds to a TNF receptor (TNFR1), the activated receptor binds two different cytoplasmic adaptor proteins (TRADD and FADD) and procaspase-8 to form a multiprotein complex at the inner surface of the plasma membrane. The cytoplasmic domains of the TNF receptor, FADD, and TRADD interact with one another by homologous regions called death domains that are present in each protein (indicated as green boxes). Procaspase-8 and FADD interact by means of homologous regions called death effector domains (indicated as brown boxes). Once assembled in the complex, the two procaspase molecules cleave one another to generate an active caspase-8 molecule containing four polypeptide segments. Caspase-8 is an initiator complex that activates downstream (executioner) caspases that carry out the death sentence. It can be noted that the interaction between TNF and TNFR1 also activates other signaling pathways, one of which leads to cell survival rather than self-destruction (page 660).


molecules (Figure 15.39). These proteins are called “procaspases” because each is a precursor of a caspase; it contains an extra portion that must be removed by proteolytic processing to activate the enzyme. The synthesis of caspases as proenzymes protects the cell from accidental proteolytic damage. Unlike most proenzymes, procaspases exhibit a low level of proteolytic activity. According to one model, when two or more procaspases are held in close association with one another, as they are in Figure 15.39, they are capable of cleaving one another’s polypeptide chain and converting the other molecule to the fully active caspase. The final mature enzyme (i.e., caspase-8) contains four polypeptide chains, derived from two procaspase precursors as illustrated in the figure. Activation of caspase-8 is similar in principle to the activation of effectors by a hormone or growth factor. In all of these signaling pathways, the binding of an extracellular ligand causes a change in conformation of a receptor that leads to the binding and activation of proteins situated downstream in the pathway. Caspase-8 is described as an initiator caspase because it initiates apoptosis by cleaving and activating downstream, or executioner, caspases, that carry out the controlled self-destruction of the cell as described above.

The Intrinsic Pathway of Apoptosis

3 The first member of the family, Bcl-2 itself, was originally identified in 1985 as a cancer-causing oncogene in human lymphomas. The gene encoding Bcl-2 was overexpressed in these malignant cells as the result of a translocation. We now understand that Bcl-2 acts as an oncogene by promoting survival of potential cancer cells that would otherwise die by apoptosis.

[Figure 15.40 labels: internal cellular damage; activation of proapoptotic BH3-only protein and subsequent activation of Bak or Bax; Bax; cytochrome c; cytoplasmic factors (e.g., Apaf-1); procaspase-9; activated initiator caspase-9 complex (apoptosome); executioner procaspases; executioner caspase]

Figure 15.40 The intrinsic (mitochondria-mediated) pathway of apoptosis. Various types of cellular stress cause proapoptotic members of the Bcl-2 family of proteins–either Bax or Bak–to oligomerize within the outer mitochondrial membrane, forming channels that facilitate the release of cytochrome c molecules from the intermembrane space. Once in the cytosol, the cytochrome c molecules form a multisubunit complex with a cytosolic protein called Apaf-1 and procaspase-9 molecules. Procaspase-9 molecules are apparently activated to their full proteolytic capacity as the result of a conformational change induced by association with Apaf-1. Caspase-9 molecules cleave and activate executioner caspases, which carry out the apoptotic response. The intrinsic pathway can be triggered in some cells (e.g., hepatocytes) by extracellular signals. This occurs as the initiator caspase of the extrinsic pathway, caspase 8, cleaves a BH3-only protein called Bid, generating a protein fragment (tBid) that binds to Bax, inducing insertion of Bax into the OMM and release of cytochrome c from mitochondria.


Internal stimuli, such as irreparable genetic damage, lack of oxygen (hypoxia), extremely high concentrations of cytosolic Ca2+, viral infection, ER stress, or severe oxidative stress (i.e., the production of large numbers of destructive free radicals), trigger apoptosis by the intrinsic pathway illustrated in Figure 15.40. Activation of the intrinsic pathway is regulated by members of the Bcl-2 family of proteins, which are characterized by the presence of one or more small BH domains. Bcl-2 family members can be subdivided into three groups: (1) proapoptotic members (containing several BH domains) that promote apoptosis (Bax and Bak), (2) antiapoptotic members (containing several BH domains) that protect cells from apoptosis (e.g., Bcl-xL, Bcl-w, and Bcl-2),3 and (3) BH3-only proteins (so-named because they contain only one BH domain), which promote apoptosis by an indirect mechanism. According to the prevailing view, BH3-only proteins (e.g., Bid, Bad, Puma, and Bim) can exert their proapoptotic effect in two different ways, depending on the particular proteins involved. In some cases they promote apoptosis by inhibiting antiapoptotic Bcl-2 members, whereas in other cases they promote apoptosis by activating proapoptotic Bax or Bak. In either case, the BH3-only proteins are the likely determinants as to whether a cell follows a pathway of survival or death.

In a healthy cell, the BH3-only proteins are either absent or strongly inhibited, and the antiapoptotic Bcl-2 proteins are able to restrain proapoptotic members. The mechanism by which this occurs is debated. It is only in the face of certain types of stress that the BH3-only proteins are expressed or activated, thereby shifting the balance in the direction of apoptosis. In these circumstances, the restraining effects of the antiapoptotic Bcl-2 proteins are overridden, and the proapoptotic protein Bax is free to translocate from the cytosol to the outer mitochondrial membrane (OMM). Although the mechanism is not entirely clear, it is thought that Bax molecules (and/or Bak molecules, which are permanent residents of the OMM) undergo a change in conformation that causes them to assemble into a multisubunit, protein-lined channel within the OMM. Once formed, this channel dramatically increases the permeability of the OMM and promotes the release of certain mitochondrial proteins, most notably cytochrome c, which resides in the intermembrane space (see Figure 5.17). Nearly all of the cytochrome c molecules present in all of a cell’s mitochondria can be released from an apoptotic cell in a period as short as five minutes. Cells lacking both Bax and Bak are protected from apoptosis, revealing the essential roles of these proapoptotic proteins in this process. Release of proapoptotic mitochondrial proteins such as cytochrome c is apparently the “point of no return,” that is, an event that irreversibly commits the cell to apoptosis.

Once in the cytosol, cytochrome c forms part of a wheel-shaped multiprotein complex called the apoptosome, that also includes several molecules of procaspase-9. Procaspase-9 molecules are thought to become activated by simply joining the multiprotein complex and do not require proteolytic cleavage (Figure 15.40). Like caspase-8, which is activated by the receptor-mediated pathway described above, caspase-9 is an initiator caspase that activates downstream executioner caspases, which bring about apoptosis.4 The extrinsic (receptor-mediated) and intrinsic (mitochondria-mediated) pathways ultimately converge by activating the same executioner caspases, which cleave the same cellular targets.

You might be wondering why cytochrome c, a component of the electron transport chain, and the mitochondrion, an organelle that functions as the cell’s power plant, would be involved in initiating apoptosis.
There is no obvious answer to this question at the present time. The key role of mitochondria in apoptosis is even more perplexing when one considers that these organelles have evolved from prokaryotic endosymbionts and that prokaryotes do not undergo apoptosis. As cells execute the apoptotic program, they lose contact with their neighbors and start to shrink. Finally, the cell disintegrates into a condensed, membrane-enclosed apoptotic body. This entire apoptotic program can be executed in less than an hour. The apoptotic bodies are recognized by the presence of phosphatidylserine on their surface. Phosphatidylserine is a phospholipid that is normally present only on the inner leaflet of the plasma membrane. During apoptosis, a phospholipid “scramblase” moves phosphatidylserine molecules to the outer leaflet of the plasma membrane where they are recognized as an “eat me” signal by specialized macrophages. Apoptotic cell death thus occurs without spilling cellular content into the extracellular environment (Figure 15.41). This is important because the release of cellular debris can trigger inflammation, which can cause a significant amount of tissue damage.

4 Other intrinsic pathways that are independent of Apaf-1 and caspase-9, and possibly independent of cytochrome c, have also been described.

[Figure 15.41 labels: apoptotic cell; condensed, fragmented chromatin; cytoplasm of macrophage]

Figure 15.41 Clearance of apoptotic cells is accomplished by phagocytosis. This electron micrograph shows an apoptotic cell “corpse” within the cytoplasm of a phagocyte. Note the compact nature of the engulfed cell and the dense state of its chromatin. (FROM PETER M. HENSON, DONNA L. BRATTON, AND VALERIE A. FADOK, CURR. BIOL. 11:R796, 2001, FIG. 1A, © 2001, WITH PERMISSION FROM ELSEVIER.)

Signaling Cell Survival

Just as there are signals that commit a cell to self-destruction, there are opposing signals that maintain cell survival. In fact, interaction of TNF with a TNF receptor often transmits two distinct and opposing signals into the cell interior: one stimulating apoptosis and another stimulating cell survival. As a result, most cells that possess TNF receptors do not undergo apoptosis when treated with TNF. This was a disappointing finding because it was originally hoped that TNF could be used as an agent to kill tumor cells. Cell survival is typically mediated through the activation of a key transcription factor called NF-κB, which activates the expression of genes encoding cell-survival proteins. It would appear that the fate of a cell, whether survival or death, depends on a delicate balance between proapoptotic and antiapoptotic signals.
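The way the extrinsic and intrinsic pathways converge on the same executioner caspases, and the way survival signals can oppose them, can be summarized as a toy decision model. The sketch below is a drastic simplification for study purposes only: the class and field names, the use of a simple numerical comparison for the Bcl-2 family balance, and the rule that an NF-κB survival signal fully blocks the TNF death signal are all illustrative assumptions, not mechanisms established in the text.

```python
from dataclasses import dataclass


@dataclass
class CellState:
    """Hypothetical summary of the signals acting on one cell (arbitrary units)."""
    tnf_bound: bool = False            # TNF bound to TNFR1 (extrinsic stimulus)
    nfkb_survival_signal: bool = False  # NF-kappaB survival pathway active
    bh3_only_activity: float = 0.0      # stress-induced BH3-only proteins
    antiapoptotic_bcl2: float = 1.0     # restraining Bcl-2-type proteins
    has_bax_or_bak: bool = True         # cells lacking both are protected


def active_initiator_caspases(cell):
    """Return the initiator caspases this toy model considers activated."""
    active = []
    # Extrinsic pathway: receptor complex assembles and procaspase-8 is processed,
    # unless (assumption) the survival signal dominates.
    if cell.tnf_bound and not cell.nfkb_survival_signal:
        active.append("caspase-8")
    # Intrinsic pathway: BH3-only proteins override antiapoptotic members,
    # Bax/Bak release cytochrome c, and the apoptosome activates caspase-9.
    if cell.has_bax_or_bak and cell.bh3_only_activity > cell.antiapoptotic_bcl2:
        active.append("caspase-9")
    return active


def undergoes_apoptosis(cell):
    """Either initiator caspase activates the same executioner caspases."""
    return len(active_initiator_caspases(cell)) > 0


healthy = CellState()
stressed = CellState(bh3_only_activity=2.0)
print(undergoes_apoptosis(healthy), active_initiator_caspases(stressed))
```

A real cell integrates these inputs quantitatively and with cross-talk (for example, caspase-8 cleavage of Bid feeding into the intrinsic pathway), which this boolean sketch deliberately leaves out.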

REVIEW 1. What are some of the functions of apoptosis in vertebrate biology? Describe the steps that occur between (a) the time that a TNF molecule binds to its receptor and the eventual death of the cell and (b) the time a proapoptotic Bcl-2 member binds to the outer mitochondrial membrane and the death of the cell. 2. What is the role of the formation of caspase-containing complexes in the process of apoptosis?


| Synopsis Utilization of glucose is controlled by a signaling pathway that begins with an activated GPCR. The breakdown of glycogen into glucose is stimulated by the hormones epinephrine and glucagon, which act as first messengers by binding to their respective receptors on the outer surface of target cells. Binding of the hormones activates an effector, adenylyl cyclase, on the membrane’s inner surface, leading to the production of the diffusible second messenger cyclic AMP (cAMP). Cyclic AMP evokes its response through a reaction cascade in which a series of enzymes are covalently modified. Cyclic AMP molecules bind to the regulatory subunits of a cAMP-dependent protein kinase called PKA, which phosphorylates phosphorylase kinase and glycogen synthase, leading to the activation of the former enzyme and inhibition of the latter one. The activated phosphorylase kinase molecules add phosphates to glycogen phosphorylase, activating the latter enzyme, leading to the breakdown of glycogen to glucose 1-phosphate, which is converted to glucose. As a result of this reaction cascade, the original message—as delivered by the binding of the hormone at the cell surface—is greatly amplified, and the response time is greatly reduced. Reaction cascades of this type also provide varied sites of regulation. The addition of phosphate groups by kinases is reversed by phosphatases that remove the phosphates. Cyclic AMP is produced in many different cells in response to a wide variety of first messengers. The course of events that occurs in the target cell depends on the specific proteins phosphorylated by the cAMP-dependent kinase. (p. 631) Many extracellular stimuli initiate a cellular response by binding to the extracellular domain of a receptor protein-tyrosine kinase (RTK), which activates the tyrosine kinase domain located at the inner surface of the plasma membrane. 
RTKs regulate diverse functions, including cell growth and proliferation, the course of cell differentiation, the uptake of foreign particles, and cell survival. The best-studied growth-stimulating ligands, such as PDGF, EGF, and FGF, activate a signaling pathway called the MAP kinase cascade that includes a small monomeric GTP-binding protein called Ras. Like other G proteins, Ras cycles between an inactive GDP-bound form and an active GTP-bound form. In its active form, Ras stimulates effectors that lie downstream in the signaling pathway. Like other G proteins, Ras has a GTPase activity (stimulated by a GAP) that hydrolyzes the bound GTP to form bound GDP, thus switching itself off. When a ligand binds to the RTK, trans-autophosphorylation of the cytoplasmic domain of the receptor leads to the recruitment of Sos, an activator of Ras, to the inner surface of the membrane. Sos catalyzes the exchange of GDP for GTP, thus activating Ras. Activated Ras has an increased affinity for another protein called Raf, which is recruited to the plasma membrane where it becomes an active protein kinase that initiates an orderly chain of phosphorylation reactions that are outlined in Figure 15.20. The ultimate targets of the MAP kinase cascade are transcription factors that stimulate the expression of genes whose products play a key role in activating the cell cycle, leading to the initiation of DNA synthesis and cell division. The MAP kinase cascade is found in all eukaryotes from yeast to mammals, though it has adapted through evolution to evoke different responses in different types of cells. (p. 636) Insulin mediates many of its actions on target cells through interaction with the insulin receptor, which is an RTK. The activated kinase adds phosphate groups to tyrosine residues located on both the receptor and on receptor-associated docking proteins called IRSs. 
The phosphorylated tyrosine residues in an IRS serve as docking sites for proteins that have SH2 domains, which become activated upon binding the IRS. Several separate signaling pathways

Synopsis

Cell signaling is a phenomenon in which information is relayed across the plasma membrane to the cell interior and often to the cell nucleus. Cell signaling typically includes recognition of the stimulus at the outer surface of the plasma membrane; transfer of the signal across the plasma membrane; and transmission of the signal to the cell interior, triggering a response. Responses may include a change in gene expression, an alteration of the activity of metabolic enzymes, a reconfiguration of the cytoskeleton, a change in ion permeability, the activation of DNA synthesis, or the death of the cell. This process is often called signal transduction. Within the cell, information passes along signaling pathways, which often include protein kinases and protein phosphatases that activate or inhibit their substrates through changes in conformation. Another prominent feature of signaling pathways is the involvement of GTP-binding proteins that serve as switches that turn a pathway on or off. (p. 618)

Many extracellular stimuli (first messengers) initiate responses by interacting with a G protein-coupled receptor (GPCR) on the outer cell surface and stimulating the release of a second messenger within the cell. Many extracellular messenger molecules act by binding to receptors that are integral membrane proteins containing seven membrane-spanning α helices (GPCRs). The signal is transmitted from the receptor to the effector by a heterotrimeric G protein. These proteins are referred to as heterotrimeric because they have three subunits (α, β, and γ) and as G proteins because they bind guanine nucleotides, either GDP or GTP. Each G protein can exist in two states: an active state with a bound GTP or an inactive state with a bound GDP. Hundreds of different G protein-coupled receptors have been identified that respond to a wide variety of stimuli. All of these receptors act through a similar mechanism. The binding of the ligand to its specific receptor causes a change in the conformation of the receptor that increases its affinity for the G protein. As a result, the ligand-bound receptor binds to the G protein, causing the G protein to release its bound GDP and bind a GTP replacement, which switches the G protein into the active state. Exchange of guanine nucleotides changes the conformation of the Gα subunit, causing it to dissociate from the other two subunits, which remain together as a Gβγ complex. Each dissociated Gα subunit with its attached GTP can activate specific effector molecules, such as adenylyl cyclase. The dissociated Gα subunit is also a GTPase and, with the help of an accessory protein, hydrolyzes the bound GTP to form bound GDP, which shuts off the subunit’s ability to activate effector molecules. The Gα-GDP then reassociates with the Gβγ subunits to reform the trimeric complex and return the system to the resting state. Each of the three subunits that make up a heterotrimeric G protein can exist in different isoforms. Different combinations of specific subunits compose G proteins having different properties in their interactions with both receptors and effectors. (p. 621)

Phospholipase C is another important effector on the inner surface of the plasma membrane that can be activated by heterotrimeric G proteins. PI-phospholipase C splits phosphatidylinositol 4,5-bisphosphate (PIP2) into two different second messengers, inositol 1,4,5-trisphosphate (IP3) and 1,2-diacylglycerol (DAG). DAG remains in the plasma membrane where it activates the enzyme protein kinase C, which phosphorylates serine and threonine residues on a variety of target proteins. Constitutive activation of protein kinase C leads to loss of growth control. IP3 is a small, water-soluble molecule that can diffuse into the cytoplasm where it binds to IP3 receptors located on the surface of the smooth ER. IP3 receptors are tetrameric calcium ion channels; binding of IP3 leads to an opening of the ion channels and diffusion of Ca2+ into the cytosol. (p. 629)
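The GDP/GTP switch of the heterotrimeric G protein summarized above can be pictured as a small state machine. The Python sketch below is only an illustration under simplifying assumptions: the class name `GProtein`, its fields, and its methods are hypothetical, and the model ignores isoforms, accessory proteins, and reaction kinetics.

```python
from dataclasses import dataclass


@dataclass
class GProtein:
    """Toy model of a heterotrimeric G protein (hypothetical, for illustration only)."""
    nucleotide: str = "GDP"      # guanine nucleotide bound to the G-alpha subunit
    trimer_intact: bool = True   # True while G-alpha is associated with G-beta-gamma

    @property
    def active(self) -> bool:
        # Active state: G-alpha carries GTP and has separated from G-beta-gamma.
        return self.nucleotide == "GTP" and not self.trimer_intact

    def receptor_binds(self) -> None:
        """A ligand-bound receptor causes GDP release and GTP binding."""
        if self.nucleotide == "GDP" and self.trimer_intact:
            self.nucleotide = "GTP"
            self.trimer_intact = False   # G-alpha-GTP dissociates from G-beta-gamma

    def hydrolyze_gtp(self) -> None:
        """The GTPase activity of G-alpha converts GTP to GDP; the trimer re-forms."""
        if self.nucleotide == "GTP":
            self.nucleotide = "GDP"
            self.trimer_intact = True


def run_cycle() -> list:
    """Return the active/inactive status at rest, after activation, and after hydrolysis."""
    g = GProtein()
    states = [g.active]
    g.receptor_binds()
    states.append(g.active)
    g.hydrolyze_gtp()
    states.append(g.active)
    return states


print(run_cycle())
```

In this sketch a nonhydrolyzable GTP analogue would correspond to never calling `hydrolyze_gtp`, which leaves the modeled subunit in the active state.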

Chapter 15 Cell Signaling and Signal Transduction: Communication Between Cells

may become activated as the result of different signaling proteins binding to the phosphorylated IRS. One pathway may stimulate DNA synthesis and cell division, another may stimulate the movement of glucose transporters to the cell membrane, and yet another may activate transcription factors that turn on the expression of insulin-specific genes. (p. 644)

The rapid elevation of cytosolic Ca2+, whether brought about by the opening of ion channels in cytoplasmic membranes or in the plasma membrane, triggers a wide variety of cellular responses. The concentration of Ca2+ ions within the cytosol is normally maintained at about 10^-7 M by the action of Ca2+ pumps located in the plasma membrane and the SER membrane. Many different stimuli, ranging from a fertilizing sperm to a nerve impulse arriving at a muscle cell, cause a sudden rise in cytosolic [Ca2+], which may follow the opening of the plasma membrane Ca2+ channels, IP3 receptors, or ryanodine receptors, which are a different type of calcium channel located in the SER membrane. Depending on the type of cell, ryanodine channels may be opened by an action potential arriving at the cell, or by the influx of a small amount of Ca2+ through the plasma membrane. Among the responses, elevated cytosolic [Ca2+] can lead to the activation or inhibition of various enzymes and transport systems, membrane fusion, or alterations in cytoskeletal or contractile functions. Calcium does not act on these various targets in the free ionic state, but instead it binds to one of a small number of calcium-binding proteins, which in turn elicit the response. The most widespread of these proteins is calmodulin, which contains four calcium-binding sites. The calcium ion is also an important intracellular messenger in plant cells, where it mediates responses to a variety of stimuli, including changes in light, pressure, gravity, and the concentration of plant hormones such as abscisic acid. (p. 648)

Different signaling pathways are often interconnected. As a result, signals from a variety of unrelated ligands can converge to activate a common effector, such as Ras; signals from the same ligand can diverge to activate a variety of different effectors; and signals can be passed back and forth between different pathways (cross-talk). (p. 653)

Nitric oxide (NO) acts as an intercellular messenger that diffuses directly through the plasma membrane of the target cell. Included among the activities stimulated by NO is the relaxation of the smooth muscle cells that line blood vessels. NO is produced by the enzyme nitric oxide synthase, which uses arginine as a substrate. NO often functions by activating guanylyl cyclase to produce the second messenger cGMP. (p. 655)

Signaling pathways may end in apoptosis, the programmed death of the cell. Examples of apoptosis include the death of cells between the developing digits of the hand, the death of T lymphocytes that react with the body’s own tissues, and the death of potential cancer cells. Death by apoptosis is characterized by the overall compaction of the cell and its nucleus and the orderly dissection of the chromatin by special endonucleases. Apoptosis is mediated by proteolytic enzymes called caspases that activate or deactivate key protein substrates by removing a portion of their polypeptide chain. Two distinct pathways of apoptosis have been identified, one triggered by extracellular stimuli acting through death receptors, such as TNFR1, and the other triggered by internal cellular stress acting through the release of cytochrome c from the intermembrane space of mitochondria and the activation of proapoptotic members of the Bcl-2 protein family. (p. 656)

Analytic Questions

1. The subject of cell signaling was placed near the end of the book because it ties together so many different topics in cell biology. Now that you have read the chapter, would you agree or disagree with this statement? Support your conclusions by example.
2. Suppose the signaling pathway in Figure 15.3 were to lead to the activation of a gene that inhibits a cyclin-dependent kinase responsible for moving a cell into S phase of the cell cycle. How would a debilitating mutation in protein kinase 3 affect the cell’s growth?
3. What might be the effect on liver function of a mutation in a gene that encodes a cAMP phosphodiesterase? of a mutation in a gene encoding a glucagon receptor? of a mutation in a gene encoding phosphorylase kinase? of a mutation that altered the active site of the GTPase of a Gα subunit? (Assume in all cases that the mutation causes a loss of function of the gene product.)
4. Ca2+, IP3, and cAMP have all been described as second messengers. In what ways are their mechanisms of action similar? In what ways are they different?
5. In the reaction cascade illustrated in Figure 15.22, which steps lead to amplification and which do not?
6. Suppose that epinephrine and norepinephrine could initiate a similar response in a particular target cell. How could you determine whether or not the two compounds act by binding to the same cell-surface receptor?
7. One of the key experiments to show that gap junctions (page 262) allowed the passage of small molecules was carried out by allowing cardiac muscle cells (which respond to norepinephrine by contraction) to form gap junctions with ovarian granulosa cells (which respond to FSH by undergoing various metabolic changes). The researchers then added FSH to the mixed cell culture and observed the contraction of the muscle cells. How could muscle cells respond to FSH, and what does this tell you about the structure and function of gap junctions?
8. How would you expect a GTP analogue that the cell could not hydrolyze (a nonhydrolyzable analogue) to affect signaling events that take place during the stimulation of a liver cell by glucagon? What would be the effect of the same analogue on signal transduction of an epithelial cell after exposure to epidermal growth factor (EGF)? How would this compare to the effects of the cholera toxin (page 627) on these same cells?
9. You suspect that phosphatidylcholine might be serving as a precursor for a second messenger that triggers secretion of a hormone in a type of cultured endocrine cell that you are studying. Furthermore, you suspect that the second messenger released by the plasma membrane in response to a stimulus is choline phosphate. What type of experiment might you perform to verify your hypothesis?
10. Figure 15.27 shows the localized changes in [Ca2+] within the dendritic tree of a Purkinje cell. Calcium ions are small, rapidly diffusible agents. How is it possible for a cell to maintain different concentrations of this free ion in different regions of its cytosol? What do you suspect would happen if you injected a small volume of a calcium chloride solution into one region of a cell that had been injected previously with a fluorescent calcium probe?

11. Formulate a hypothesis that might explain how contact of the outer surface of an egg by a fertilizing sperm at one site causes a wave of Ca2+ release that spreads through the entire egg, as shown in Figure 15.29.
12. Because calmodulin activates many different effectors (e.g., protein kinases, phosphodiesterases, calcium transport proteins), a calmodulin molecule must have many different binding sites on its surface. Would you agree with this statement? Why or why not?
13. Diabetes mellitus is a disease that can result from a number of different defects involving insulin function. Describe three different molecular abnormalities in a liver cell that could cause different patients to exhibit a similar clinical picture, including, for example, high concentrations of glucose in the blood and urine.
14. Would you expect a cell’s response to EGF to be more sensitive to the fluidity of the plasma membrane than its response to insulin? Why or why not?
15. Would you expect a mutation in Ras to act dominantly or recessively as a cause of cancer? Why? (A dominant mutation causes its effect when only one of the homologous alleles is mutated, whereas a recessive mutation requires that both alleles of the gene are mutated.)
16. Speculate on a mechanism by which apoptosis might play a crucial role in combating the development of cancer, a topic discussed in the following chapter.
17. You are working with a type of fibroblast that normally responds to epidermal growth factor by increasing its rate of growth and division, and to epinephrine by lowering its rate of growth and division. You have determined that both of these responses require the MAP kinase pathway, and that EGF acts by means of an RTK and epinephrine by means of a G protein-coupled receptor. Suppose you identified a mutant strain of these cells that can still respond to EGF but is no longer inhibited by epinephrine. You suspect that the mutation is affecting the cross-talk between two pathways (shown in Figure 15.35). Which component in this figure might be affected by such a mutation?

18. In what way is the calcium wave that occurs at fertilization similar to a nerve impulse that travels down a neuron?
19. Now that you have read the section on taste perception, why do you suppose it has been difficult to find effective rat poisons?
20. One of the genes of the cowpox virus encodes a protein called CrmA that is a potent inhibitor of caspases. What effect would you expect this inhibitor to have on an infected cell? Why is this advantageous to the infecting virus?
21. Most RTKs act directly on downstream effectors, whereas the insulin RTK acts through an intermediate docking protein, an insulin receptor substrate (IRS). Are there any advantages in signaling that might accrue from the use of these IRS intermediates?
22. Researchers have reported that (1) most of the physiologic effects of insulin on target cells can be blocked by incubation of cells with wortmannin, a compound that specifically inhibits the enzyme PI3K, and (2) causing cells to overexpress a constitutively active form of PKB (i.e., a form of the enzyme that is continually active regardless of circumstances) induces a response in cells that is virtually identical to addition of insulin to these cells. Looking at Figure 15.25, are these observations ones that you might have predicted? Why or why not?
23. Knockout mice that are unable to produce caspase-9 die as a result of a number of defects, most notably a greatly enlarged brain. Why would these mice have this phenotype? How would you expect the phenotype of a cytochrome c knockout mouse to compare with that of the caspase-9 knockout?
24. Why do you suppose that some people find a compound called PROP to have a bitter taste, whereas others do not report this perception?
25. The inhibition of a specific protein kinase often leads to an increased phosphorylation of many cellular proteins. How can you explain this observation?


Chapter 16: Cancer

16.1 Basic Properties of a Cancer Cell
16.2 The Causes of Cancer
16.3 The Genetics of Cancer
16.4 New Strategies for Combating Cancer
EXPERIMENTAL PATHWAYS: The Discovery of Oncogenes

Cancer is a genetic disease because it can be traced to alterations within specific genes, but in most cases, it is not an inherited disease. In an inherited disease, the genetic defect is present in the chromosomes of a parent and is transmitted to the zygote. In contrast, the genetic alterations that lead to most cancers arise in the DNA of a somatic cell during the lifetime of the affected individual. Because of these genetic changes, cancer cells become freed from many of the restraints to which normal cells are subjected. Normal cells do not divide unless they are stimulated to do so by the body’s homeostatic machinery; nor do they survive if they have incurred irreparable damage; nor do they wander away from a tissue to start new colonies elsewhere in the body. In contrast, most cancer cells experience a breakdown in all of these regulatory influences that protect the body from chaos and self-destruction. Most importantly, cancer cells proliferate uncontrollably, producing malignant tumors that invade surrounding healthy tissue (Figure 16.1). As long as the growth of the tumor remains localized, the disease can usually be treated and cured by surgical removal of the tumor. But malignant tumors tend to metastasize, that is, to spawn renegade cells that break away from the parent mass, enter the lymphatic or vascular circulation, and spread to distant sites in the body where they establish lethal secondary tumors (metastases) that are no longer amenable to surgical removal. The subject of metastasis is discussed in the Human Perspective of Chapter 7 on page 256.

Skull with Cigarette. (BY VINCENT VAN GOGH, 1885, VAN GOGH MUSEUM, AMSTERDAM/© ART RESOURCE, NY.)

Because of its impact on human health and the hope that a cure might be developed, cancer has been the focus of a massive research effort for decades. Though these studies have led to a remarkable breakthrough in our understanding of the cellular and molecular basis of cancer, they have not had a major impact on either preventing the occurrence of or increasing the chances of surviving most cancers. There has been progress, however. In 2011, the American Association for Cancer Research reported that death rates for all cancers combined dropped during the years between 1990 and 2007 by 22% for men and 14% for women. Much of this progress is attributed to earlier diagnosis and treatment of three major types of cancer: breast cancer, prostate cancer, and colon cancer. The incidence of various types of cancer in the United States and the corresponding mortality rates are shown in Figure 16.2. Most current treatments, such as chemotherapy and radiation, lack the specificity needed to kill cancer cells without simultaneously damaging normal cells, as evidenced by the serious side effects that accompany these treatments. As a result, patients cannot usually be subjected to high enough doses of chemicals or radiation to kill all of the tumor cells in their body. Researchers have been working for many years to develop more effective and less debilitating targeted therapies. Some of these newer strategies in cancer therapy will be discussed at the end of the chapter.

Figure 16.1 The invasion of normal tissue by a growing tumor. This light micrograph of a section of human liver shows a metastasized melanosarcoma (in red) that is invading the normal liver tissue. (ASTRID AND HANNS-FRIEDER MICHLER/PHOTO RESEARCHERS, INC.)

Figure 16.2 The incidence of new cancer cases and deaths in the United States in 2010. In 2010, there were 1,529,560 reported new cancer cases and 569,490 cancer deaths. (DATA FROM THE AMERICAN CANCER SOCIETY, INC.)

[Figure 16.2 bar chart. Vertical axis (0 to 80): U.S. cancer cases and U.S. cancer deaths per 100,000 population. Cancer types shown: lung and bronchus, prostate, breast, colorectal, lymphoma, bladder, melanoma, kidney, leukemia, pancreatic, brain and nervous, ovarian, soft tissue.]

16.1 | Basic Properties of a Cancer Cell

The behavior of cancer cells is most easily studied when the cells are growing in culture. Cancer cells can be obtained by removing a malignant tumor, dissociating the tissue into its separate cells, and culturing the cells in vitro. Over the years, many different lines of cultured cells that were originally derived from human tumors have been collected in cell banks and are available for study. Alternatively, normal cells can be converted to cancer cells by treatment with carcinogenic chemicals, radiation, or tumor viruses. Cells that have been transformed in vitro by chemicals or viruses can generally cause tumors when introduced into a suitable host animal. There are many differences in properties from one type of cancer cell to another. At the same time, there are a number of basic properties that are shared by cancer cells, regardless of their tissue of origin. At the cellular level, the most important characteristic of a cancer cell, whether residing in the body or on a culture dish, is its loss of growth control. The capacity for growth and division is not dramatically different between a cancer cell and most normal cells. When normal cells are grown in tissue culture under conditions that promote cell proliferation, they

grow and divide at a rate similar to that of their malignant counterparts. However, when the normal cells proliferate to the point where they cover the bottom of the culture dish, their growth rate decreases markedly, and they tend to remain as a single layer (monolayer) of cells (Figure 16.3a,b). Growth rates drop as normal cells respond to inhibitory influences from their environment. Growth-inhibiting influences may arise as the result of depletion of growth factors in the culture medium or from contact with surrounding cells on the dish. In contrast, when malignant cells are cultured under the same conditions, they continue to grow, piling on top of one another to form clumps (Figure 16.3c,d ). It is evident that malignant cells are not responsive to the types of signals that cause their normal counterparts to cease growth and division. Not only do cancer cells ignore inhibitory growth signals, they continue to grow in the absence of stimulatory growth signals that are required by normal cells. Normal cells growing in culture depend on growth factors, such as epidermal growth factor and insulin, that are present in serum (the fluid fraction of blood), which is usually added to the growth medium (Figure 16.4). Cancer cells can proliferate in the absence of serum because their cell cycle does not depend on the interaction between growth factors and their receptors, which are located at the cell surface (page 636). As we will see below, this transformation is a result of basic changes in the intracellular pathways that govern cell proliferation and survival. Normal cells growing in culture exhibit a limited capacity for cell division; after a finite number of mitotic divisions, they undergo an aging process that renders them unfit to continue to grow and divide (page 508). Cancer cells, on the other hand, are seemingly immortal because they continue to divide indefinitely. 
This difference in growth potential is often attributed to the presence of telomerase in cancer cells and its absence in normal cells. Recall from page 507 that telomerase is the enzyme that maintains the telomeres at the ends of the chromosomes, thus allowing cells to continue to divide. The absence of telomerase from most types of normal cells is thought to be one of the body’s major defenses that protects against tumor growth. The most striking alterations in the nucleus following transformation occur within the chromosomes. Unlike normal cells that replicate their DNA at a very low error rate (page 562), cancer cells are genetically unstable and often have highly aberrant chromosome complements, a condition termed aneuploidy (Figure 16.5), which may occur primarily as a result of defects in the mitotic checkpoint (page 596) or the presence of an abnormal number of centrosomes (see Figure

Figure 16.3 Growth properties of normal and cancerous cells. Normal cells typically grow in a culture dish until they cover the surface as a monolayer (a and b). In contrast, cells that have been transformed by viruses or carcinogenic chemicals (or malignant cells that have been cultured from tumors) typically grow in multilayered clumps, or foci (c and d). (B AND D: COURTESY OF G. STEVEN MARTIN, UNIVERSITY OF CALIFORNIA AT BERKELEY.)

[Figure 16.4 plot. Vertical axis: cell number. Horizontal axis: time in culture (days 1 to 4). Curves: cancer cells with serum growth factors; cancer cells without serum growth factors; normal cells with serum growth factors; normal cells without serum growth factors.]

Figure 16.4 The effects of serum deprivation on the growth of normal and transformed cells. Whereas the growth of cancer cells continues regardless of the presence or absence of exogenous growth factors, normal cells require these substances in their medium for growth to continue. The growth of normal cells levels off as the growth factors in the medium are depleted.
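The qualitative behavior described for Figure 16.4 can be mimicked with a very small simulation. The Python sketch below is a toy model, not a fit to the figure: the function name `simulate_growth`, the daily growth rate, and the size of the growth factor supply are all hypothetical values chosen only so that growth-factor-dependent cells level off as the supply runs out.

```python
def simulate_growth(days, requires_growth_factor, serum_added,
                    start_cells=1.0, daily_rate=1.0, factor_supply=10.0):
    """Return relative cell number for day 0 through `days` (toy model).

    All parameters are illustrative assumptions:
    - daily_rate: new cells produced per existing cell per day
    - factor_supply: growth factor provided by serum, in units of
      "new cells it can support"
    """
    cells = start_cells
    factor = factor_supply if serum_added else 0.0
    history = [cells]
    for _ in range(days):
        if requires_growth_factor:
            # Normal cells: division is limited by the remaining growth factor.
            new_cells = min(cells * daily_rate, factor)
            factor -= new_cells
        else:
            # Transformed cells: division proceeds without growth factors.
            new_cells = cells * daily_rate
        cells += new_cells
        history.append(cells)
    return history


for label, needs_factor, serum in [
    ("normal cells, with serum", True, True),
    ("normal cells, without serum", True, False),
    ("cancer cells, with serum", False, True),
    ("cancer cells, without serum", False, False),
]:
    print(label, simulate_growth(5, needs_factor, serum))
```

The model deliberately omits nutrient depletion and crowding, which eventually limit any real culture, so the unbounded growth of the modeled cancer cells should not be read as a quantitative prediction.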

Figure 16.5 Karyotype of a cell from a breast cancer line showing a highly abnormal chromosome complement. A normal diploid cell would have 22 pairs of autosomes and two sex chromosomes. The two members of a pair would be identical, and each chromosome would be a single continuous color (as in the karyotype of a normal cell in Figure 12.22b that uses a similar spectral visualization technique). The chromosomes of this cell are highly deranged as evidenced by the presence of extra and missing chromosomes (i.e., aneuploidy) as well as chromosomes of more than one color. These multicolored chromosomes reflect the large numbers of translocations that have occurred in previous cell generations. A cell with normal cell cycle checkpoints and apoptotic pathways could never have attained a chromosome complement approaching that seen here. (COURTESY OF JOANNE DAVIDSON AND PAUL A. W. EDWARDS.)

14.17c).¹ It is evident from Figure 16.5 that the growth of cancer cells is much less dependent on a standard diploid chromosome content than the growth of normal cells. In fact, when the chromosome content of a normal cell becomes disturbed, a signaling pathway is usually activated that leads to the self-destruction (apoptosis) of the cell. In contrast, cancer cells typically fail to elicit the apoptotic response even when their chromosome content becomes highly deranged. Protection from apoptosis is another important hallmark that distinguishes many cancer cells from normal cells. Finally, it can be noted that cancer cells often depend on glycolysis, which is considered an anaerobic metabolic pathway (Figure 3.24). This property may reflect the high metabolic requirements of cancer cells and an inadequate blood supply within the tumor. Under conditions of hypoxia (reduced O2), cancer cells activate a transcription factor called HIF that induces the formation of new blood vessels and promotes the migratory properties of the cells, which may contribute to the spread of the tumor. However, even when oxygen is plentiful, many tumor cells continue to generate much of their ATP by glycolysis (called aerobic glycolysis). Even though glycolysis generates much less ATP per glucose than does oxidative phosphorylation in the mitochondrion, it produces ATP at a more rapid rate. The increased uptake of glucose by tumor cells compared to normal cells can be used as a means to locate metastatic tumors within the body using PET scans (see Figure 16.23). It is these properties, which can be demonstrated in culture, together with their tendency to spread to distant sites within the body, that make cancer cells such a threat to the well-being of the entire organism.
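The trade-off between ATP yield per glucose and glucose consumption can be made concrete with a short calculation. In the Python sketch below, the net yield of 2 ATP per glucose from glycolysis is standard; the figure of 30 ATP per glucose for complete oxidation is an assumed round number (commonly cited estimates range from about 30 to 36), and the function name `glucose_needed` is hypothetical.

```python
ATP_PER_GLUCOSE_GLYCOLYSIS = 2    # net ATP from glycolysis alone
ATP_PER_GLUCOSE_OXIDATIVE = 30    # assumed value; estimates range from about 30 to 36


def glucose_needed(atp_demand, atp_per_glucose):
    """Glucose molecules required to supply a given number of ATP molecules."""
    return atp_demand / atp_per_glucose


atp_demand = 60  # arbitrary illustrative demand
by_glycolysis = glucose_needed(atp_demand, ATP_PER_GLUCOSE_GLYCOLYSIS)
by_oxidation = glucose_needed(atp_demand, ATP_PER_GLUCOSE_OXIDATIVE)

print("glucose needed, glycolysis only:", by_glycolysis)
print("glucose needed, complete oxidation:", by_oxidation)
print("fold increase in glucose use:", by_glycolysis / by_oxidation)
```

Under these assumptions, a cell that relies on glycolysis alone must take up many times more glucose to meet the same ATP demand, which is consistent with the elevated glucose uptake that PET scans exploit.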

There is controversy as to whether the development of aneuploidy occurs at an early stage in tumor formation and is a cause of the genetic instability that characterizes cancer cells, or is a late event and is simply a consequence of abnormal cancer growth.

1. Describe some of the properties that distinguish cancer cells from normal cells. 2. How do the properties of cancer cells manifest themselves in culture?

16.2 | The Causes of Cancer In 1775, Percivall Pott, a British surgeon, made the first known correlation between an environmental agent and the development of cancer. Pott concluded that the high incidence of cancer of the nasal cavity and the skin of the scrotum in chimney sweeps was due to their chronic exposure to soot. Within the past several decades, the carcinogenic chemicals in soot have been isolated, along with hundreds of other com-

16.2 The Causes of Cancer

1

REVIEW

Chapter 16 Cancer

668

are obvious: smoking causes lung cancer, exposure to ultraviolet radiation causes skin cancer, and inhaling asbestos fibers causes mesothelioma. But despite a large number of studies, we are still uncertain as to the causes of most types of human cancer. Humans live in complex environments and are exposed to many potential carcinogens in a changing pattern over a period of decades. Attempting to determine the causes of cancer from a mountain of statistical data obtained from the answers to questionnaires about individual lifestyles has proven very difficult. The importance of environmental factors (e.g., diet) is seen most clearly in studies of the children of couples that have moved from Asia to the United States or Europe. These individuals no longer exhibit a high rate of gastric cancer, as occurs in Asia, but instead are subject to an elevated risk of colon and breast cancer, which is characteristic of Western countries (Figure 16.6). There is a general consensus among epidemiologists that diet can play a major role in the risk of developing cancer. Cancer rates are higher among obese individuals than the non-obese population and studies in primates suggests that a calorie-restricted diet (page 647) protects against cancer. Recent attention has focused on elevated levels of insulin and insulin-like growth factor (IGF-1) that are found in obese individuals as being a primary cause of the increased cancer incidence in this group. There is also evidence that some ingredients in the diet, such as animal fat and alcohol, can increase the risk of developing cancer, whereas certain compounds found in food items may reduce that risk. Examples of the latter include isoflavones found in soy, sul-

100

100

80

80

80

60

40

20

0

Incidence per 100,000

100

Incidence per 100,000

Incidence per 100,000

pounds shown to cause cancer in laboratory animals. In addition to a diverse array of chemicals, a number of other types of agents are also carcinogenic, including ionizing radiation and a variety of DNA- and RNA-containing viruses. All of these agents have one property in common: they alter the genome. Carcinogenic chemicals, such as those present in soot or cigarette smoke, can almost always be shown either to be directly mutagenic or to be converted to mutagenic compounds by cellular enzymes. Similarly, ultraviolet radiation, which is the leading cause of skin cancer, is also strongly mutagenic. A number of viruses can infect mammalian cells growing in cell culture, transforming them into cancer cells. These viruses are broadly divided into two large groups: DNA tumor viruses and RNA tumor viruses, depending on the type of nucleic acid found within the mature virus particle. Among the DNA viruses capable of transforming cells are polyoma virus, simian virus 40 (SV40), adenovirus, and herpes-like viruses. RNA tumor viruses, or retroviruses, are similar in structure to HIV (see Figure 1.22b) and are the subject of the Experimental Pathways, which can be found at the end of the chapter. Tumor viruses can transform cells because they carry genes whose products interfere with the cell’s normal growthregulating activities. Although tumor viruses were an invaluable tool for researchers in identifying numerous genes involved in cell transformation, they are associated with only a small number of human cancers. Other types of viruses are, however, linked to as many as 20 percent of cancers worldwide. In most cases, these viruses greatly increase a person’s risk of developing the cancer, rather than being the sole determinant responsible for the disease. This relationship between viral infection and cancer is illustrated by human papilloma virus (HPV), which can be transmitted through sexual activity and is increasing in frequency in the population. 
Although the virus is present in about 90 percent of cervical cancers, indicating its importance in development of the disease, the vast majority of women who have been infected with the virus will never develop this malignancy. HPV is also linked as a primary causative agent of cancers of the mouth and tongue in both men and women. Effective vaccines against this virus are now available. Other viruses linked to human cancers include hepatitis B virus, which is associated with liver cancer; Epstein-Barr virus, which is associated with Burkitt’s lymphoma in areas where malaria is common; and a herpes virus (HHV-8), which is associated with Kaposi’s sarcoma. Certain gastric lymphomas are associated with chronic infection by the stomach-dwelling bacterium Helicobacter pylori, which can also cause ulcers. Recent evidence suggests that many of these cancers linked to persistent viral and bacterial infections are actually caused by the chronic inflammation that is triggered by the presence of the pathogen. Inflammatory bowel disease (IBD), which is also characterized by chronic inflammation, has been associated with an increased risk of colon cancer. These findings have caused researchers to look more closely at the general process of inflammation as a previously unexplored factor in the development of many types of cancers. Determining the causes of different types of cancer is an endeavor carried out by epidemiologists, researchers who study disease patterns in populations. The causes of certain cancers

60

40

20

0

Stomach (male)

60

40

20

0

Breast (female)

Colon (male)

Japanese

Second-generation migrants

First-generation migrants

Caucasian Hawaiians

Figure 16.6 Changing cancer incidence in persons of Japanese descent following migration to Hawaii. The incidence of stomach cancer declines, whereas that of breast and colon cancer rises. However, of the three types of cancer, only colon cancer has reached rates equivalent to Caucasian Hawaiians by the second generation. (FROM L. N. KOLONEL ET AL., REPRINTED WITH PERMISSION FROM NATURE REVS. CANCER 4:3, 2004; COPYRIGHT 2004. NATURE REVIEWS CANCER BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


foraphanes found in broccoli, and EGCG found in tea. Several widely prescribed drugs also have a preventive effect. Drugs that interfere with the action of estrogen (e.g., tamoxifen or raloxifene) or the metabolism of testosterone (e.g., finasteride) can reduce the incidence of breast cancer or prostate cancer, respectively. Long-term use of nonsteroidal anti-inflammatory drugs (NSAIDs) such as aspirin and indomethacin has been shown to markedly decrease the risk of colon cancer. They are thought to have this effect by inhibiting cyclooxygenase-2, an enzyme that catalyzes the synthesis of hormone-like prostaglandins, which promote the growth of intestinal polyps. The cancer-suppressing action of NSAIDs supports the idea that inflammation plays a major role in the development of various cancers. Persons who have taken the antidiabetes drug metformin also appear to have a significantly reduced risk of developing cancer. In this case, the benefit may be a result of the drug’s action in lowering the circulating levels of insulin and IGF-1 (page 647).

16.3 | The Genetics of Cancer


Cancer is one of the two leading causes of death in Western countries, afflicting approximately one in every three individuals. Viewed in this way, cancer is a very common disease. But at the cellular level, the development of a cancer is a remarkably rare event. Whenever the cells of a cancerous tumor are genetically scrutinized, they are invariably found to have arisen from a single cell. Thus, unlike other diseases that require modification of a large number of cells, cancer results from the uncontrolled proliferation of a single wayward cell. (Cancer is said to be monoclonal.) Consider for a moment that the human body contains trillions of cells, billions of which undergo cell division on any given day. Though almost any one of these dividing cells may have the potential to change in genetic composition and grow into a malignant tumor, this only occurs in about one-third of the human population during an entire lifetime. One of the primary reasons why a greater number of cells do not give rise to cancerous tumors is that malignant transformation requires more than a single genetic alteration.

We can distinguish between two types of genetic alterations that might make us more likely to develop a particular type of cancer: those that we inherit from our parents (germ-line mutations) and those that occur during our own lifetime (somatic mutations). There are a few types of mutations that we can inherit that make us much more likely to develop cancer. The study of these mutations has taught us a great deal about how malfunctioning genes can lead to the development of cancer; some of these inherited cancer syndromes will be discussed later in this section. However, inherited mutations are not a major factor in most cases of the disease.
One way to determine an overall estimate of the impact of inheritance in tumor formation is to ascertain the likelihood that two identical twins will develop the same type of cancer by the time the individuals reach a certain age. Studies of this type suggest that the likelihood two 75-year-old identical twins will share a particular cancer, such as breast cancer or prostate cancer, is generally between 10 and 15 percent, depending on the type of cancer. Clearly, the genes that we inherit have a significant influence on our risks of developing cancer, but the greatest impact comes from genes that are altered during our lifetime.

The development of a malignant tumor (tumorigenesis) is a multistep process characterized by a progression of permanent genetic alterations in a single line of cells, which may occur over the course of many successive cell divisions and take decades to complete. Each genetic change may elicit a particular feature of the malignant state, such as protection from apoptosis, as discussed in Section 16.1. As these genetic changes gradually occur, the cells in the line become increasingly less responsive to the body’s normal regulatory machinery and better able to invade normal tissues. According to this concept, tumorigenesis requires that the cell responsible for initiating the cancer be capable of a large number of cell divisions. This requirement has focused a great deal of attention on the types of cells that are present in a tissue that might have the potential to develop into a tumor. The most common solid tumors, such as those of the breast, colon, prostate, and lung, arise in epithelial tissues that are normally engaged in a relatively high level of cell division. The same is true of leukemias, which develop in rapidly dividing blood-forming tissues. The cells of most tissues can be roughly divided into three groups: (1) stem cells, which possess unlimited proliferation potential, have the capacity to produce more of themselves, and can give rise to all of the cells of the tissue (page 20); (2) progenitor cells, which are derived from stem cells and possess a limited ability to proliferate; and (3) the differentiated end products of the tissue, which generally lack the capability to divide. Examples of these three groups of cells are illustrated in Figure 17.6.
Given the fact that tumor formation requires that a cell be capable of extensive division, two general scenarios have been considered for the origin of tumors. According to one scenario, cancer arises from within the relatively small population of stem cells that inhabit each adult tissue. Given their long life and unlimited division potential, stem cells have the opportunity to accumulate the mutations required for malignant transformation. According to another scenario, progenitor cells can give rise to malignant tumors by acquiring certain properties, such as the capacity for unlimited proliferation, as part of the process of tumor progression. As illustrated in Figure 16.7, these two scenarios are not mutually exclusive, in that some tumors are thought to arise from stem cells and others from the progenitor cell population. As a cancer grows, the cells in the tumor mass are subjected to a type of natural selection that drives the accumulation of cells with properties that are most favorable for tumor growth. For example, only those tumors containing cells that maintain the length of their telomeres will be capable of unlimited growth (page 508). Any cell that appears within a tumor that happens to express telomerase will have a tremendous growth advantage over other cells that fail to express the enzyme. Over time, the telomerase-expressing cells will flourish while the nonexpressing cells will die off, and all of the cells in the tumor will contain telomerase. Expression of telomerase illustrates another important feature of tumor progression;


[Figure 16.7 diagram labels: Tissue stem cell; Pluripotent progenitor cell; Committed progenitor cell; Mature cells; Oncogenic event A; Tumor subtypes x, y, and z.]


Figure 16.7 The proposed cells of origin of malignant tumors. Tissues contain cells in various stages of commitment and differentiation. These include stem cells, multipotent progenitor cells that can give rise to a variety of types of differentiated cells, committed progenitor cells that can give rise to only one type of differentiated cell, and the differentiated cells themselves (see Figure 17.6 for examples). According to the model depicted here, tumors can arise from either tissue stem cells or progenitor cells, though in some cases at least, these different cells of origin give rise to different types of cancers (indicated by the three different colors of the tumors). (J. E. VISVADER, NATURE 469:316, 2011 FIGURE 2B. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

not all of these changes result from genetic mutation. The activation of telomerase expression can be considered an epigenetic change, one that results from the activation of a gene that is normally repressed. As discussed in Chapter 12, this type of activation process likely involves a change in the structure of chromatin in and around the gene and/or a change in the state of DNA methylation. Once the epigenetic change has occurred, it is transmitted to all of the progeny of that cell and, consequently, represents a permanent, inheritable alteration. Even after they have become malignant, cancer cells continue to accumulate mutations and epigenetic changes that make them increasingly abnormal (as is evident in Figure 16.5). This genetic instability makes the disease difficult to treat by conventional chemotherapy because cells often arise within the tumor mass that are resistant to the drug. The genetic changes that occur during tumor progression are often accompanied by histological changes, that is, changes


Figure 16.8 Detection of abnormal (premalignant) cells in a Pap smear. (a) Normal squamous epithelial cells of the cervix. The cells have a uniform shape with a small centrally located nucleus. (b) Abnormal cells from a case of carcinoma in situ, which is a preinvasive cancer of the cervix. The cells have heterogeneous shapes and large nuclei. (A: DR. E. WALKER/PHOTO RESEARCHERS, INC.; B: SPL/PHOTO RESEARCHERS, INC.)

in the appearance of the cells. The initial changes often produce cells that can be identified as “precancerous,” which indicates that they have gained some of the properties of a cancer cell, such as loss of certain growth controls, but lack the capability to invade normal tissues or metastasize to distant sites. The Pap smear is a test for detecting precancerous cells in the epithelial lining of the cervix. The development of cervical cancer typically progresses over a period of more than 10 years and is characterized by cells that appear increasingly abnormal (less well differentiated than normal cells, with larger nuclei, as in Figure 16.8). When cells having an abnormal appearance are detected, the precancerous lesion in the cervix can be located and destroyed by laser treatment, freezing, or surgery. Some tissues often generate benign tumors, which contain cells that have proliferated to form a mass that poses little threat of becoming malignant. The moles that we all possess are an example of benign tumors. Studies indicate that the pigment cells that compose a mole have undergone a response that causes them to enter a permanent state of growth arrest, referred to as


senescence. Senescence is apparently triggered in these pigment cells after they have undergone certain of the genetic changes that would have otherwise set them on a course to becoming a malignant cancer. This process of “forced senescence” represents another pathway that has evolved to restrict the development of cancers in higher organisms. The molecular basis of senescence is discussed further on page 677.

Tumor-Suppressor Genes and Oncogenes: Brakes and Accelerators

The genes that have been implicated in carcinogenesis are divided into two broad categories: tumor-suppressor genes and oncogenes. Tumor-suppressor genes act as a cell’s brakes; they encode proteins that restrain cell growth and prevent cells from becoming malignant (Figure 16.9a). The existence of such genes originally came to light from studies in the late 1960s in which normal and malignant rodent cells were fused to one another. Some of the cell hybrids formed from this type of fusion lost their malignant characteristics, suggesting that a normal cell possesses factors that can suppress the uncontrolled growth of a cancer cell. Further evidence for the existence of tumor-suppressor genes was gathered from observations that specific regions of particular chromosomes are consistently deleted in cells of certain types of cancer. If the absence of such genes is correlated with the development of a tumor, then it follows that the presence of these genes normally suppresses the formation of the tumor. Oncogenes, on the other hand, encode proteins that promote the loss of growth control and the conversion of a cell to

a malignant state (Figure 16.9b). Most oncogenes act as accelerators of cell proliferation, but they have other roles as well. Oncogenes may lead to genetic instability, prevent a cell from becoming a victim of apoptosis, or promote metastasis. The existence of oncogenes was discovered through a series of investigations on RNA tumor viruses that is documented in the Experimental Pathways at the end of this chapter. These viruses transform a normal cell into a malignant cell because they carry an oncogene that encodes a protein that interferes with the cell’s normal activities. The turning point in these studies came in 1976, when it was discovered that an oncogene called src, carried by an RNA tumor virus called avian sarcoma virus, was actually present in the genome of uninfected cells. The oncogene, in fact, was not a viral gene, but a cellular gene that had become incorporated into the viral genome during a previous infection. It soon became evident that cells possess a variety of genes, now referred to as proto-oncogenes, that have the potential to subvert the cell’s own activities and push the cell toward the malignant state. As discussed below, proto-oncogenes encode proteins that have various functions in a cell’s normal activities. Proto-oncogenes can be converted into oncogenes (i.e., activated) by several mechanisms (Figure 16.10):

1. The gene can be mutated in a way that alters the properties of the gene product so that it no longer functions normally (Figure 16.10, path a).
2. The gene can become duplicated one or more times, resulting in gene amplification and excess production of the encoded protein (Figure 16.10, path b).

[Figure 16.9 diagram labels: (a) Mutated tumor-suppressor gene; Copies of the mutated tumor-suppressor gene on both homologues; (b) Proto-oncogene; Mutated proto-oncogene has become oncogene; outcomes shown as Normal cell growth or Loss of growth control.]

Figure 16.9 Contrasting effects of mutations in tumor-suppressor genes (a) and oncogenes (b). Whereas a mutation in one of the two copies (alleles) of an oncogene may be sufficient to cause a cell to lose growth control, both copies of a tumor-suppressor gene must be knocked out to induce the same effect. As discussed shortly, oncogenes arise from proto-oncogenes as the result of gain-of-function mutations, that is, mutations that cause the gene product to exhibit new functions that lead to malignancy. Tumor-suppressor genes, in contrast, suffer loss-of-function mutations and/or epigenetic inactivation that render them unable to restrain cell growth.

Figure 16.10 Activation of a proto-oncogene to an oncogene. Activation can be accomplished in several ways as indicated in this figure. In pathway a, a mutation in the gene alters the structure and function of the encoded protein. In pathway b, gene amplification results in overexpression of the gene. In pathway c, a rearrangement of the DNA brings a new DNA segment into the vicinity or up against the gene, altering either its expression or the structure of the encoded protein.

[Figure 16.10 diagram labels: Regulatory region; Proto-oncogene; Protein encoded by proto-oncogene. Path a: Mutation or deletion; Encoded protein with altered structure/function. Path b: Gene duplication; Increased synthesis of encoded protein. Path c: A DNA regulatory sequence translocated from distant site alters expression of downstream gene; Increased synthesis of encoded protein; OR A protein-coding gene translocated from distant site fuses with portion of gene causing formation of a fusion gene; Synthesis of a protein containing portions encoded by different genes. The fusion protein is no longer under normal control.]

3. A chromosome rearrangement can occur that brings a DNA sequence from a distant site in the genome into close proximity of the gene, which can alter either the expression of the gene or the nature of the gene product (Figure 16.10, path c).

Any of these genetic alterations can cause a cell to become less responsive to normal growth controls. Oncogenes act dominantly, which is to say that a single copy of an oncogene can cause the cell to express the altered phenotype, regardless of whether or not there is a normal, unactivated copy of the gene on the homologous chromosome (Figure 16.9b). Researchers have taken advantage of this property to identify oncogenes by introducing the DNA suspected of containing the gene into cultured cells and monitoring the cells for evidence of altered growth properties (page 697).

We saw earlier that the development of a human malignancy requires more than a single genetic alteration. The reason becomes more apparent with the understanding that there are two types of genes responsible for tumor formation. As long as a cell has its full complement of tumor-suppressor genes, it is thought to be protected against the effects of an oncogene for reasons that will be evident when the functions of these genes are discussed below. Most tumors contain alterations in both tumor-suppressor genes and oncogenes, suggesting that the loss of a tumor-suppressor function within a cell must be accompanied by the conversion of a proto-oncogene into an oncogene before the cell can become fully malignant. Even then, the cell may not exhibit all of the properties required to invade surrounding tissues or to form secondary colonies by metastasis. Mutations in additional genes, such as those encoding cell-adhesion molecules or extracellular proteases (discussed on page 256), may be required before these cells acquire the full life-threatening phenotype. We can now turn to the functions of the products encoded by both tumor-suppressor genes and oncogenes and examine how mutations in these genes can cause a cell to become malignant.

Tumor-Suppressor Genes

The transformation of a normal cell to a cancer cell is accompanied by the loss of function of one or more tumor-suppressor genes. At the present time, more than two dozen genes have been implicated as tumor suppressors in humans, some of which are listed in Table 16.1. Included among the genes on the list are those that encode transcription factors (e.g., TP53 and WT1), cell cycle regulators (e.g., RB and INK4a), components that regulate G proteins (NF1), a phosphoinositide phosphatase (PTEN), and a protein that regulates protein degradation (VHL).2 In one way or another, most of the proteins encoded by tumor-suppressor genes act as negative regulators of cell proliferation, which is why their elimination promotes uncontrolled cell growth. The products of tumor-suppressor genes also help maintain genetic stability, which may be a primary reason that tumors contain such an aberrant karyotype (Figure 16.5). Some tumor-suppressor genes are involved in the development of a wide variety of different cancers, whereas others play a role in the formation of one or a few cancer types.

2 For the present chapter, which deals primarily with human biology, we will follow a convention that is commonly used: human genes are written in capital letters (e.g., APC), mouse genes are written with the first letter capitalized (e.g., Brca1), and viral genes are written in lower case (e.g., src).

Table 16.1 Tumor-Suppressor Genes

Gene | Primary tumor | Proposed function | Inherited syndrome
APC | Colorectal | Binds β-catenin acting as transcription factor | Familial adenomatous polyposis
BRCA1 | Breast | DNA repair | Familial breast cancer
MSH2, MLH1 | Colorectal | Mismatch repair | HNPCC
E-Cadherin | Breast, colon, etc. | Cell adhesion molecule | Familial gastric cancer
INK4a | Melanoma, pancreatic | p16: Cdk inhibitor; ARF: stabilizes p53 | Familial melanoma
NF1 | Neurofibromas | Activates GTPase of Ras | Neurofibromatosis type 1
NF2 | Meningiomas | Links membrane to cytoskeleton | Neurofibromatosis type 2
TP53 | Sarcomas, lymphomas, etc. | Transcription factor (cell cycle and apoptosis) | Li-Fraumeni syndrome
PTEN | Breast, thyroid | PIP3 phosphatase | Cowden disease
RB | Retinal | Binds E2F (cell cycle transcription regulation) | Retinoblastoma
VHL | Kidney | Protein ubiquitination and degradation | von Hippel-Lindau syndrome
WT1 | Wilms tumor of kidney | Transcription factor | Wilms tumor

It is common knowledge that members of some families are at high risk of developing certain types of cancers. Although these inherited cancer syndromes are rare, they provide an unprecedented opportunity to identify tumor-suppressor genes that, when missing, contribute to the development of both inherited and sporadic (i.e., noninherited) forms of cancer. The first tumor-suppressor gene to be studied and eventually cloned, and one of the most important, is associated with a rare childhood cancer of the retina of the eye, called retinoblastoma. The gene responsible for this disorder is named RB. The incidence of retinoblastoma follows two distinct patterns: (1) it occurs at high frequency and at young age in members of certain families, and (2) it occurs sporadically at an older age among members of the population at large. The fact that retinoblastoma runs in certain families suggested that the cancer can be inherited. Examination of cells from children suffering from retinoblastoma revealed that one member of the thirteenth pair of homologous chromosomes was missing a small piece from the interior portion of the chromosome. The deletion was present in all of the children’s cells (both the cells of the retinal cancer and cells elsewhere in the body), indicating that the chromosomal aberration had been inherited from one of the parents. Retinoblastoma is inherited as a dominant genetic trait because members of high-risk families that develop the disease inherit one normal allele and one abnormal allele.

But unlike most dominantly inherited conditions, such as Huntington’s disease, where an individual who inherits a missing or an altered gene invariably develops the disorder, children who inherit a chromosome missing the retinoblastoma gene inherit a strong disposition toward developing retinoblastoma, rather than the disorder itself. In fact, approximately 10 percent of individuals who inherit a chromosome with an RB deletion never develop the retinal cancer. How is it that a small percentage of these predisposed individuals escape the disease? The genetic basis of retinoblastoma was explained in 1971 by Alfred Knudson of the University of Texas. Knudson proposed that the development of retinoblastoma requires that both copies of the RB gene of a retinal cell be either eliminated or mutated before the cell can give rise to a retinoblastoma. In other words, the cancer arises as the result of two independent “hits” in a single cell. In cases of sporadic retinoblastoma, the tumor develops from a retinal cell in which both copies of the RB gene have undergone successive spontaneous mutation (Figure 16.11a). Because the chance that both alleles of the same gene will be the target of debilitating mutations in the same cell is extremely unlikely, the incidence of the cancer in the general population is extremely low. In contrast, the cells of a person who inherits a chromosome with an RB deletion are already halfway along the path to becoming malignant. Mutation or deletion of the remaining RB allele in any of the cells of the retina produces a cell that lacks a normal RB gene and thus cannot produce a functional RB gene product (Figure 16.11b). This explains why individuals who inherit an abnormal RB gene are so highly predisposed to developing the cancer. The second “hit” fails to occur in approximately 10 percent of these individuals, who do not develop the disease. Knudson’s hypothesis was subsequently confirmed by examining cells from patients with an inherited disposition to retinoblastoma and finding that, as predicted, both alleles of the gene were missing or mutated in the cancer cells. Individuals with sporadic retinoblastomas had normal cells that lacked RB mutations and tumor cells in which both alleles of the gene were mutated. Although deficiencies in the RB gene are first manifested in the development of retinal cancers, this is not the end of the story. People who suffer from the inherited form of retinoblastoma are also at high risk of developing other types of tumors later in life, particularly soft-tissue sarcomas (tumors of mesenchymal rather than epithelial origin).

The consequences of RB mutations are not confined to persons who inherit a mutant allele. Mutations in RB alleles are a common occurrence in sporadic breast, prostate, and lung cancers among individuals who have inherited two normal RB alleles. When cells from these tumors are cultured in vitro, the reintroduction of a wild-type RB gene back into the cells is generally sufficient to suppress their cancerous phenotype, indicating that the loss

[Figure 16.11 diagram labels: RB gene; Retinal cell; Mutated RB gene inherited from parent; Spontaneous mutation in one copy of RB gene; Spontaneous mutation in second copy of RB gene; outcomes shown as Normal cell growth or Loss of growth control.]

Figure 16.11 Mutations in the RB gene that can lead to retinoblastoma. (a) In sporadic (i.e., nonfamilial) cases of the disease, an individual begins life with two normal copies of the RB gene in the zygote, and retinoblastoma occurs only in those rare individuals in whom a given retinal cell accumulates independent mutations in both alleles of the gene. (b) In familial (i.e., inherited) cases of the disease, an individual begins life with one abnormal allele of the RB gene, usually present as a deletion. Thus all the cells of the retina have at least one of their two RB genes nonfunctional. If the other RB allele in a retinal cell becomes inactivated, usually as the result of a point mutation, that cell gives rise to a retinal tumor.

of this gene function contributes significantly to tumorigenesis. Let’s look more closely at the role of the RB gene.
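
Knudson’s two-hit reasoning lends itself to a rough numerical sketch. The Python snippet below is an illustration only, not a model taken from the text: the number of susceptible retinal cells (N_CELLS) is a hypothetical round figure, the two RB alleles are assumed to be hit independently and with the same probability, and that probability is back-calculated from the observation that about 10 percent of carriers never sustain a second hit. All function and variable names are invented for this example.

```python
import math


def calibrate_p_hit(n_cells, escape_fraction):
    """Per-allele hit probability p such that (1 - p) ** n_cells == escape_fraction."""
    return -math.expm1(math.log(escape_fraction) / n_cells)


def p_tumor_familial(n_cells, p_hit):
    """Chance that at least one cell loses its single remaining normal RB allele."""
    return -math.expm1(n_cells * math.log1p(-p_hit))


def p_tumor_sporadic(n_cells, p_hit):
    """Chance that at least one cell independently loses both RB alleles."""
    return -math.expm1(n_cells * math.log1p(-p_hit * p_hit))


N_CELLS = 1_000_000  # ASSUMPTION: hypothetical count of susceptible retinal cells
ESCAPE = 0.10        # about 10 percent of carriers never develop the tumor

p_hit = calibrate_p_hit(N_CELLS, ESCAPE)
familial = p_tumor_familial(N_CELLS, p_hit)
sporadic = p_tumor_sporadic(N_CELLS, p_hit)

print(f"per-allele hit probability: {p_hit:.3e}")
print(f"familial risk (one hit needed): {familial:.3f}")
print(f"sporadic risk (two hits needed): {sporadic:.3e}")
```

Under these assumptions the familial risk is about 90 percent by construction, whereas the sporadic risk comes out several orders of magnitude lower, which is the qualitative contrast that the two-hit hypothesis explains. The absolute numbers should not be read as measured incidences.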

The Role of pRB in Regulating the Cell Cycle

The importance of the cell cycle in cell growth and proliferation was discussed in Chapters 14 and 15, where it was noted that factors that control the cell cycle can play a pivotal role in the development of cancer. In its best studied role, the protein encoded by the RB gene, pRB, helps regulate the passage of cells from the G1 stage of the cell cycle into S phase, during which DNA synthesis occurs. As discussed on page 576, the transition from G1 to S is a time of commitment for the cell; once a cell enters S phase, it invariably proceeds through the remainder of the cell cycle and into mitosis. The transition from G1 to S is accompanied by the activation of many different genes that encode proteins ranging from DNA polymerases to cyclins and histones. Among the transcription factors involved in activating genes required for S-phase activities are members of the E2F family of transcription factors, which are key targets of pRB. A model depicting the role of pRB in controlling E2F activity is illustrated in Figure 16.12. During G1, E2F proteins are normally bound to pRB, which prevents the E2F molecules from activating a number of genes encoding proteins required for S-phase activities (e.g., cyclin E and DNA polymerase α). Studies suggest (as indicated in step 1 of Figure 16.12) that the E2F–pRB complex is associated with DNA but acts as a gene repressor rather than a gene activator. As the end of G1 approaches, the pRB subunit of the pRB–E2F complex is phosphorylated by the cyclin-dependent kinases that regulate the G1–S transition. Once phosphorylated, pRB releases its bound E2F, allowing the transcription factor to activate gene expression, which marks the cell’s irreversible commitment to enter S phase. A cell that loses pRB activity as the result of RB mutation would be expected to lose its ability to inactivate E2F, thereby removing certain restraints over the entry to S phase.

E2F is only one of dozens of proteins capable of binding to pRB, suggesting that pRB has numerous other functions. The complexity of pRB interactions is also suggested by the fact that the protein contains at least 16 different serine and threonine residues that can be phosphorylated by cyclin-dependent kinases. It is likely that phosphorylation of different combinations of amino acid residues allows the protein to interact with different downstream targets. The importance of pRB as a negative regulator of the cell cycle is demonstrated by the fact that DNA tumor viruses (including adenoviruses, human papilloma virus, and SV40) encode a protein that binds to pRB, blocking its ability to bind to E2F. The ability of these viruses to induce cancer in infected cells depends on their ability to block the negative influence that pRB has on progression of a cell through the cell cycle. By using these pRB-blocking proteins, these viruses accomplish the same result as when the RB gene is deleted, leading to the development of human tumors.

675 E2F

pRB

1

Gene repression Cdk activation leads to pRb phosphorylation and dissociation from E2F E2F

The importance of p53 as an antitumor weapon is most evident from the fact that TP53 is the most commonly mutated gene in human cancers; approximately half of all human tumors contain cells with point mutations or deletions in both alleles of the TP53 gene (Figure 16.13a). Furthermore, tumors composed of cells bearing TP53 mutations are correlated with a poorer survival rate than those containing a wild-type TP53 gene. Clearly, the elimination of TP53 func-

2

+

P

P

pRB

Pancreatic cancer Ovarian cancer Colon cancer Lung cancer Liver cancer Stomach cancer Kidney cancer Prostate cancer Breast cancer Melanoma

P

Transcription 3

E2F Gene activation

mRNA 0

Encoded protein (a)

4

5

G1

10

20

30

G245

S

R175 Figure 16.12 The role of pRB in controlling transcription of genes required for progression of the cell cycle. During most of G1, the unphosphorylated pRB is bound to the E2F protein. The E2F–pRB complex binds to regulatory sites in the promoter regions of numerous genes involved in cell cycle progression, acting as a transcriptional repressor that blocks gene expression. Repression probably involves the methylation of lysine 9 of histone H3 that modulates chromatin architecture (page 501). Activation of the cyclin-dependent kinase (Cdk) leads to the phosphorylation of pRB, which can no longer bind the E2F protein (step 2). In the pathway depicted here, loss of the bound pRB converts the DNA-bound E2F into a transcriptional activator, leading to expression of the genes being regulated (step 3). The mRNA is translated into proteins (step 4) that are required for the progression of cells from G1 into S phase of the cell cycle (step 5). Other roles of pRB have been identified but are not discussed.

50

60

70

80

90

100

R249 R248

R273

R282

(b)

Figure 16.13 The role of the tumor suppressor gene TP53 in human cancer. (a) The frequency with which both alleles of the TP53 gene are mutated in different types of cancers. The data refers to the most common form of each of these 10 types of cancers. (b) p53 function is particularly sensitive to mutations in its DNA-binding domain. p53 functions as a tetramer, each subunit of which consists of several domains with different functions. This image shows a ribbon drawing of the DNA-binding domain. The six amino acid residues most often mutated in p53 molecules that have been debilitated in human cancers are indicated in a single-letter nomenclature (Figure 2.26). These residues occur at or near the protein–DNA interface and either directly impact the binding of the protein to DNA or alter its conformation. (A: COSMIC V54 RELEASE FORBES ET AL., 2011. MAR 39, D945, ’11 DATABASE ISSUE. CELL 148:412, COPYRIGHT 2012 ELSEVIER INC. CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER; B: FROM Y. CHO, S. GORINA, P. D. JEFFREY, N. P. PAVLETICH, SCIENCE 265:352, 1994. REPRINTED WITH PERMISSION FROM AAAS.)

16.3 The Genetics of Cancer

The Role of p53: Guardian of the Genome The TP53 gene may have more to do with the development of human cancer than any other component of the genome. The gene gets its name from the product it encodes, p53, which is a polypeptide having a molecular mass of 53,000 daltons. In 1990, TP53 was recognized as the tumor-suppressor gene that, when absent, is responsible for a rare inherited disorder called Li-Fraumeni syndrome. Victims of this disease are afflicted with a very high incidence of various cancers, including breast and brain cancer and leukemia. Like individuals with the inherited form of retinoblastoma, persons with Li-Fraumeni syndrome inherit one normal and one abnormal (or deleted) allele of the TP53 tumor-suppressor gene and are thus highly susceptible to cancers that result from random mutations in the normal allele.
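The reasoning behind this susceptibility can be made concrete with a rough calculation. The sketch below is only an illustration under stated assumptions: each functional allele is taken to be lost independently with a per-cell probability `mu`, and a tissue is treated as `n_cells` independent cells. The values of `mu` and `n_cells`, and the function name `prob_any_cell_loses_both`, are hypothetical choices made for illustration, not measurements from the text.

```python
import math

def prob_any_cell_loses_both(mu, n_cells, functional_alleles):
    """Probability that at least one of n_cells loses every functional allele.

    mu is an assumed, illustrative per-allele, per-cell probability of an
    inactivating mutation; alleles and cells are treated as independent.
    """
    p_cell = mu ** functional_alleles          # every remaining allele must be hit
    # 1 - (1 - p_cell) ** n_cells, computed stably for very small p_cell
    return -math.expm1(n_cells * math.log1p(-p_cell))

MU = 1e-6        # hypothetical mutation probability per allele per cell
N_CELLS = 1e7    # hypothetical number of cells at risk in a tissue

sporadic = prob_any_cell_loses_both(MU, N_CELLS, functional_alleles=2)
inherited = prob_any_cell_loses_both(MU, N_CELLS, functional_alleles=1)
print(f"two normal alleles: {sporadic:.2e}")
print(f"one normal allele:  {inherited:.4f}")
```

With these made-up numbers, a tissue whose cells start with two normal alleles very rarely produces a cell that has lost both, whereas a tissue whose cells start with a single normal allele is nearly certain to contain at least one cell with no functional copy.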

[Figure 16.13, panel (a) y-axis: % Mutations of TP53 gene]

tion is an important step in the progression of many cancer cells toward the fully malignant state. Why is the presence of p53 so important in preventing a cell from becoming malignant? For one thing, p53 seems to bind a very long list of different proteins, as well as DNA, and is involved in a diverse array of cellular activities. In its best studied role, p53 serves as a transcription factor that acts as a crucial player in a cell’s response to stress. When a cell sustains DNA damage, p53 responds by altering the expression of a large number of genes involved in cell cycle regulation, apoptosis, and/or senescence. The importance of the transcription-regulating role of p53 is evident in Figure 16.13b, which shows the location of the six mutations most commonly found to disable p53 in human cancers; all of them map in the region of the protein that interacts with DNA. One of the best studied genes activated by p53 encodes a protein called p21 that inhibits the cyclin-dependent kinase that normally drives a cell through the G1 checkpoint. As the level of p53 rises in the damaged G1 cell, expression of the p21 gene is activated, and progression through the cell cycle is arrested (see Figure 14.9). This gives the cell time to repair the genetic damage before it initiates DNA replication. When both copies of the TP53 gene in a cell have been mutated so that their product is no longer functional, the cell can no longer produce the p21 inhibitor or exercise the feedback control that prevents it from entering S phase when it is not prepared to do so. Failure to repair DNA damage leads to the production of abnormal cells that have the potential to become malignant. Cell cycle arrest is not the only way that p53 protects an organism from developing cancer. Alternatively, p53 can direct a genetically damaged cell along a pathway that leads to death by apoptosis, thereby ridding the body of cells with a malignant potential.
p53 is thought to direct a cell into apoptosis as the result of several events, including the activation of expression of the BAX gene, whose encoded protein initiates apoptosis (page 660). Not all actions of p53 are dependent on its activation of transcription. p53 is also capable of binding directly to several members of the Bcl-2 family proteins (page 659) in a manner that stimulates apoptosis. For example, p53 can bind to Bax proteins at the outer mitochondrial membrane, directly triggering membrane permeabilization and release of apoptotic factors. If both alleles of TP53 should become inactivated, a cell that is carrying damaged DNA fails to be destroyed, even though it lacks the genetic integrity required for controlled growth (Figure 16.14). Several studies have shown that established tumors in mice will undergo regression when the activity of their p53 genes is restored. This finding suggests that tumor development continues to depend on the absence of a functional TP53 gene, even after its cells become genetically unstable. For these reasons, the development of therapies that restore p53 function to p53-deficient cells has become an active area of research. The most advanced therapy based on this strategy involves injection into the tumor of an adenovirus that carries a wild-type TP53 gene. This approach has been used widely in China, but a similar adenoviral vector (Advexin) has not been approved by the FDA at the time of this writing. The level of p53 in a healthy G1 cell is very low, which keeps its potentially lethal action under control. However, if a G1 cell sustains genetic damage, as occurs if the cell is subjected to ultraviolet light or chemical carcinogens, the concentration of p53 rises rapidly. A similar response can be elicited simply by injecting a cell with DNA containing broken strands. The increase in p53 levels is not due to increased expression of the gene but to an increase in the stability of the protein. In unstressed cells, p53 has a half-life of a few minutes. p53 degradation is facilitated by a protein called MDM2, which binds to p53 and escorts it out of the nucleus and into the cytosol. Once in the cytosol, MDM2 adds ubiquitin molecules to the p53 molecule, leading to its destruction by a proteasome (page 541). How does DNA damage lead to stabilization of p53?
We saw on page 580 that persons suffering from ataxia telangiectasia lack a protein kinase called ATM and are unable to respond properly to DNA-damaging radiation. ATM is normally activated following DNA damage, and p53 is one of the proteins ATM phosphorylates. The phosphorylated version of the p53 molecule is no longer able to interact with MDM2, which stabilizes existing p53 molecules in the nucleus and allows them to activate the expression of genes such as p21 and BAX (see Figure 16.16).
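Why a longer half-life alone raises the level of p53 can be seen with a simple steady-state calculation. The following sketch assumes constant synthesis and first-order degradation, so that the steady-state amount equals the synthesis rate divided by the decay constant (ln 2 divided by the half-life). The half-life values of 5 and 60 minutes and the synthesis rate are hypothetical; the text states only that the half-life in unstressed cells is a few minutes.

```python
import math

def steady_state_level(synthesis_rate, half_life_min):
    """Steady-state amount under constant synthesis and first-order decay."""
    decay_constant = math.log(2) / half_life_min   # per minute
    return synthesis_rate / decay_constant

SYNTHESIS_RATE = 100.0        # hypothetical molecules made per minute (unchanged by stress)
UNSTRESSED_HALF_LIFE = 5.0    # assumed value standing in for "a few minutes"
STABILIZED_HALF_LIFE = 60.0   # assumed half-life once MDM2 can no longer bind

low = steady_state_level(SYNTHESIS_RATE, UNSTRESSED_HALF_LIFE)
high = steady_state_level(SYNTHESIS_RATE, STABILIZED_HALF_LIFE)
print(f"fold increase with no change in gene expression: {high / low:.1f}")
```

In this simplified model the fold increase equals the ratio of the two half-lives, which is consistent with the observation that p53 can accumulate without any change in transcription of the TP53 gene.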

[Figure 16.14 diagram labels: DNA damage; p53 level rises; G1 arrest; Repair before division; Division with damage (mutation, aneuploidy)]

Chapter 16 Cancer

[Figure 16.14 diagram labels, continued: or apoptosis; No p53; No G1 arrest; Tumor; Mitotic failure and cell death]

Figure 16.14 A model for the function of p53. (a) Cell division does not normally require the involvement of p53. (b) If, however, the DNA of a cell becomes damaged as the result of exposure to mutagens, the level of p53 rises and acts either to arrest the progression of the cell through G1 or to direct the cell toward apoptosis. (c) If both copies of the TP53 gene are inactivated, the cell loses the ability to arrest the cell cycle or commit the cell to apoptosis following DNA damage. As a result, the cell either dies from mitotic failure or continues to proliferate with genetic abnormalities that may lead to the formation of a malignant growth. (D. P. LANE, REPRINTED WITH PERMISSION FROM NATURE 358:15, 1992; COPYRIGHT 1992. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

Some tumor cells have been found that contain a wild-type TP53 gene but extra copies of MDM2. Such cells are thought to produce excessive amounts of MDM2, which prevents p53 from building to required levels to stop the cell cycle or induce apoptosis following DNA damage (or other oncogenic stimuli). A major effort is underway to develop drugs that block the interaction between MDM2 and p53 in an attempt to restore p53 activity in cancer cells that retain this key tumor suppressor. The relationship between MDM2 and p53 has also been demonstrated using gene knockouts. Mice that lack a gene encoding MDM2 die at an early stage of development, presumably because their cells undergo p53-dependent apoptosis. This interpretation is supported by the finding that mice lacking genes that encode both MDM2 and p53 (double knockouts) survive to adulthood but are highly prone to cancer. Because these embryos cannot produce p53, they don’t require a protein such as MDM2 that facilitates p53 destruction. This observation illustrates an important principle in cancer genetics: even if a “crucial” gene such as RB or TP53 is not mutated or deleted, the function of that gene can be affected as the result of alterations in other genes whose products are part of the same pathway as the “crucial” gene. In this case, overexpression of MDM2 can have the same effect as the absence of p53. As long as the tumor-suppressor pathway is blocked, the tumor-suppressor gene itself need not be mutated. Numerous studies indicate that both the p53 and pRB pathways have to be inactivated, one way or another, to allow the progression of most tumor cells. Because of its ability to trigger apoptosis, p53 plays a pivotal role in treatment of cancer by radiation and chemotherapy. It was assumed for many years that cancer cells are more susceptible than normal cells to drugs and radiation because cancer cells divide more rapidly. But some cancer cells divide more slowly than their normal counterparts, yet they are still more sensitive to drugs and radiation. An alternative theory suggests that normal cells are more resistant to drugs or radiation because, once they sustain genetic damage, they either arrest their cell cycle until the damage is repaired or they undergo apoptosis. In contrast, cancer cells that have sustained DNA damage are more likely to become apoptotic, as long as they possess a functioning TP53 gene. If cancer cells lose p53 function, they often cannot be directed into apoptosis and they become highly resistant to further treatment (Figure 16.15). This may be the primary reason why tumors that typically lack a functional TP53 gene (e.g., colon cancer, prostate cancer, and pancreatic cancer) respond much more poorly to radiation and chemotherapy than tumors that possess a wild-type copy of the gene (e.g., testicular cancer and childhood acute lymphoblastic leukemias).

[Figure 16.15 column labels: Untreated; 5-Fluorouracil; Etoposide; Adriamycin. Row labels: (+/+); (+/−); (−/−)]

Figure 16.15 Experimental demonstration of the role of p53 in the survival of cells treated with chemotherapeutic agents. Cells were cultured from mice that had two functional alleles of the gene encoding p53 (top row), one functional allele of the gene (middle row), or were lacking a functional allele of the gene (bottom row). Cultures of each of these cells were grown either in the absence of a chemotherapeutic agent (first column) or in the presence of one of the three compounds indicated at the top of the other three columns. It is evident that the compounds had a dramatic effect on arresting growth and inducing cell death (apoptosis) in normal cells, whereas the cells lacking p53 continued to proliferate in the presence of these compounds. (FROM SCOTT W. LOWE, H. E. RULEY, T. JACKS, AND D. E. HOUSMAN, CELL 74:959, 1993, WITH PERMISSION FROM ELSEVIER.)

The Role of p53 in Promoting Senescence We have seen how p53 can direct a potential cancer cell into either growth arrest or apoptosis. Recent studies indicate that p53 also controls signaling pathways that lead to cellular senescence, another mechanism that has evolved as a barrier that stops wayward cells from developing into malignant tumors. Unlike apoptotic cells, senescent cells can remain alive and metabolically active but are permanently arrested in a nondividing state, as exemplified by the senescent melanocytes found in moles (discussed on page 670). In other cases, senescent cells may be ingested by phagocytic immune cells. Senescence can be triggered in an otherwise normal cell by the experimental activation of an oncogene, such as Ras, which might occur with some frequency during the day-to-day activities of dividing cells in a normal tissue. Studies suggest that oncogene activation triggers a period of accelerated division after which the senescence program takes effect and slams on the brakes. This is the apparent route that is taken during the formation of benign moles. One of the pathways leading to senescence involves expression of a tumor-suppressor gene called INK4a, which is often disabled in human cancers (Table 16.1). INK4a encodes two separate tumor-suppressor proteins (proteins that are translated in alternate reading frames of the mRNA): p16, which is an inhibitor of cyclin-dependent kinases required for progression through the cell cycle, and ARF, which stabilizes p53 by inhibiting MDM2. The precise role of p53 in directing cells towards the senescent state remains unclear, but inactivation of the TP53 gene within senescent cells can cause the cells to resume their progress toward full malignancy. Whether p53 moves a cell toward cell cycle arrest, apoptosis, or senescence apparently depends on the type of posttranslational modifications to which it is subjected. As in the case of the core histones (page 499), modifications include phosphorylation, acetylation, methylation, and ubiquitination and affect more than three dozen residues within the p53 molecule. Moreover, the facts 1) that the TP53 transcript can be alternatively spliced into numerous p53 isoforms, 2) that these p53 proteins can interact with a host of different proteins, and 3) that p53 has been found to influence several other major tumor-related pathways (e.g., DNA repair, glucose metabolism, and autophagy) add additional layers of complexity to the p53 story. Dissecting the roles of these various factors in the function of this “multitasking” protein will be a daunting challenge.
Other Tumor-Suppressor Genes Although mutations in RB and TP53 are associated with a wide variety of human malignancies, mutations in a number of other tumor-suppressor genes are detected in only a few types of cancer. Familial adenomatous polyposis coli (FAP) is an inherited disorder in which individuals develop hundreds or even thousands of premalignant polyps (adenomas) from epithelial cells that line the colon wall. If not removed, cells within some of these polyps are very likely to progress to a fully malignant stage. The cells of patients with this condition were found to contain a deletion of a small portion of chromosome 5, which was subsequently identified as the site of a tumor-suppressor gene called APC. A person inheriting an APC deletion is in a similar position to one who inherits an RB deletion: if the second allele of the gene is mutated in a given cell, the protective value of the gene function is lost. The loss of the second allele of APC causes the cell to lose growth control and proliferate to form a polyp rather than differentiating into normal epithelial cells of the intestinal wall. The conversion of cells in a polyp to the more malignant state, characterized by the ability to metastasize and invade other tissues, is presumably gained by the accumulation of additional mutations, including those in TP53 (see Figure 16.20). Mutated APC genes are found not only in inherited forms of colon cancers, but also in the majority of sporadic colon tumors, suggesting that the gene plays a major role in the development of this disease. In its best studied role, APC suppresses the Wnt pathway, which activates the transcription of genes (e.g., MYC and CCND1) that promote cell proliferation. APC may also play a role in the attachment of microtubules to the kinetochores of mitotic chromosomes. Loss of APC function could therefore lead directly to abnormal chromosome segregation and aneuploidy (page 596). The presence of mutated APC DNA has been found in the blood of persons with early-stage colon cancer, which raises the possibility of a diagnostic test for the disease. It is estimated that breast cancer strikes approximately one in eight women living in the United States, Canada, and Europe. Of these cases, 5–10 percent are due to the inheritance of a gene that predisposes the individual to development of the disease. After an intensive effort by several laboratories, two genes named BRCA1 and BRCA2 were identified in the mid-1990s as being responsible for the majority of the inherited cases of breast cancer. BRCA mutations also predispose a woman to the development of ovarian cancer, which has an especially high mortality rate. It was pointed out on page 579 that cells possess checkpoints that halt progression of the cell cycle following DNA damage. The BRCA proteins are part of one or more large protein complexes that respond to DNA damage and activate DNA repair by means of homologous recombination. Cells with mutant BRCA proteins accumulate chromosomal breaks and exhibit a highly aneuploid karyotype. In cells with a functional TP53 gene, failure to repair DNA damage leads to the activation of p53, which causes the cell to either arrest cell cycle progress or undergo apoptosis, as illustrated in Figure 16.16. We have seen in this chapter that apoptosis is one of the body’s primary mechanisms of ridding itself of potential tumor cells. The mechanism of apoptosis was discussed in the last chapter, as were pathways that promote cell survival rather than cell destruction.
The best studied cell-survival pathway involves the activation of a kinase called PKB (AKT) by the phosphoinositide PIP3. PIP3, in turn, is formed by the catalytic activity of the lipid kinase PI3K (see Figure 15.25). Activation of the PI3K/PKB pathway leads to an increased likelihood that a cell will survive a stimulus that normally would lead to its destruction. Whether a cell lives or dies following a particular event depends to a large degree on the balance between proapoptotic and antiapoptotic signals. Mutations that affect this balance, such as those that contribute to the overexpression of PKB or PI3K, can shift this balance in favor of cell survival, which can provide a potential cancer cell with a tremendous advantage. Another protein that can affect the balance between life and death of a cell is the lipid phosphatase, PTEN, which removes the phosphate group from the 3-position of PIP3, converting the molecule into PI(4,5)P2, which cannot activate PKB. Cells in which both copies of the PTEN gene are inactivated tend to have an excessively high level of PIP3, which leads to an overactive population of PKB molecules. When a normal PTEN gene is introduced into tumor cells that lack a functioning copy of the gene, the cells typically undergo apoptosis, as would be expected. Like the other tumor-suppressor genes listed in Table 16.1, mutations in PTEN cause a rare hereditary disease characterized by an increased risk of cancer, and such mutations are also found in a variety of sporadic cancers.

[Figures 16.16 and 16.17, diagram labels: Growth factors (e.g., PDGF, EGF); Growth factor receptors (e.g., EGF receptor (HER2)); DNA damage; BRCA1; BRCA2; Failed repair; Checkpoint activation; MDM2; p21; Bax; Cell cycle arrest; Apoptosis; SRC; Cyclin; 2-HG]

Figure 16.16 DNA damage initiates activity of a number of proteins encoded by both tumor-suppressor genes and proto-oncogenes. In this simplified figure, DNA damage is seen to cause double-strand breaks in the DNA (step 1) that are repaired by a proposed multiprotein complex that includes BRCA1 and BRCA2 (step 2a). Mutations in either of the genes that encode these proteins can block the repair process (step 2b). If DNA damage is not repaired, a checkpoint is activated that leads to a rise in the level of p53 activity (step 3a). The p53 protein is normally inhibited by interaction with the protein MDM2 (step 3b). p53 is a transcription factor that activates expression of either (1) the p21 gene (step 4a), whose product (p21) causes cell cycle arrest, or (2) the BAX gene (step 4b), whose product (Bax) causes apoptosis. p53 activation can also promote cellular senescence, but the pathway is unclear. (REPRINTED WITH PERMISSION AFTER J. BRUGAROLAS AND T. JACKS, NATURE MED 3:721, 1997, COPYRIGHT 1997, NATURE MEDICINE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)
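The sequence of steps in Figure 16.16 can be summarized as a small decision function. This is a deliberately crude sketch, not a model taken from the text: every component is treated as all-or-none, excess MDM2 is assumed to block p53 completely, and the function name `cell_fate`, its parameters, and the outcome labels are hypothetical names chosen for illustration.

```python
def cell_fate(dna_damaged, brca_functional, p53_functional, mdm2_overexpressed=False):
    """Coarse outcome label for one cell, following the steps of Figure 16.16."""
    if not dna_damaged:
        return "normal division"
    if brca_functional:
        return "repair"                          # step 2a: BRCA1/BRCA2 complex repairs the break
    # Step 2b: repair has failed, so the checkpoint is activated (step 3a).
    # Assumption: excess MDM2 fully inhibits p53 (step 3b).
    p53_active = p53_functional and not mdm2_overexpressed
    if p53_active:
        return "cell cycle arrest or apoptosis"  # steps 4a (p21) and 4b (Bax)
    return "division with damage"

print(cell_fate(True, brca_functional=False, p53_functional=True))
print(cell_fate(True, brca_functional=False, p53_functional=False))
print(cell_fate(True, brca_functional=False, p53_functional=True, mdm2_overexpressed=True))
```

The last two calls give the same result, which mirrors the point made in the text that overexpression of MDM2 can have the same effect as the absence of p53.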

Mutation or deletion is not the only mechanism by which tumor-suppressor genes can be inactivated. Tumor-suppressor genes, such as BRCA1 or PTEN, are often rendered nonfunctional as the result of epigenetic mechanisms, such as DNA methylation or histone modification, that silence transcription of the gene (page 530).

[Figure 16.17 diagram labels: BCL-2; Mitochondrion; Transcription factors (e.g., MYC, HIF); Proteins that affect epigenetic state of chromatin (e.g., DNMT3A)]

Figure 16.17 A schematic diagram summarizing the types of proteins encoded by proto-oncogenes. These include growth factors (1), receptors for growth factors (2), protein kinases and the proteins that activate them (3), proteins that regulate the cell cycle (4), transcription factors (5), proteins that modify chromatin (6), metabolic enzymes (7), and proteins that inhibit apoptosis (8). Proteins involved in mitosis, tissue invasion, and metastasis are not included.

Oncogenes As described above, oncogenes encode proteins that promote the loss of growth control and the conversion of a cell to a malignant state. Oncogenes are derived from proto-oncogenes (page 695), which are genes that encode proteins having a function in the normal cell. Numerous oncogenes were initially identified as part of the genomes of RNA tumor viruses, but many more have been identified because of their importance in tumorigenesis as determined in laboratory animals or human tumor samples. Different oncogenes become activated in different types of tumors, which reflects variations in the signaling pathways that operate in diverse cell types. The oncogene mutated most frequently in human tumors is RAS, which encodes a GTP-binding protein (RAS) that functions as an on–off switch for a number of key signaling pathways controlling cell proliferation (page 640) and metabolism (see Figure 16.18).3 Oncogenic RAS mutants typically encode a protein whose GTPase activity cannot be stimulated, which leaves the molecule in an active GTP-bound form, sending continuous proliferation signals along the pathway. Despite extensive efforts to develop anti-RAS strategies for cancer therapy, no drugs that block RAS function have yet been approved. The functions of a number of oncogenes are summarized in Figure 16.17 and discussed below.4

3 The human genome actually contains three different RAS genes and three different RAF genes that are active in different tissues. Of these, KRAS and BRAF are most often implicated in tumor formation.

4 The reader is referred to the Human Perspective of Chapter 7 (page 256) for a discussion of genes that encode cell-surface molecules and extracellular proteases that play an important role in tissue invasion and metastasis.

[Figures 16.16 and 16.17, diagram labels, continued: Proteins that affect apoptosis (e.g., BCL-2); Proteins that control cell cycle (e.g., CYCLIN D1, CDK2); Metabolic enzymes (e.g., IDH1); Protein kinases or proteins that activate protein kinases; RAS; RAF; CDK; p53; αKG; IC; Repair]


Oncogenes That Encode Growth Factors or Their Receptors The first connection between oncogenes and growth factors was made in 1983, when it was discovered that the cancer-causing simian sarcoma virus contained an oncogene (sis) derived from the cellular gene for platelet-derived growth factor (PDGF), a protein present in human blood. Cultured cells that are transformed with this virus secrete large amounts of PDGF into the medium, which causes the cells to proliferate in an uncontrolled fashion. Overexpression of PDGF has been implicated in the development of brain tumors (gliomas). Another oncogenic virus, avian erythroblastosis virus, was found to carry an oncogene (erbB) that encodes an EGF receptor that is missing part of the extracellular domain of the protein that binds the growth factor. One might expect that the altered receptor would be unable to signal a cell to divide, but just the reverse is true. This altered version of the receptor stimulates the cell constitutively, that is, regardless of whether or not the growth factor is present in the medium. This is the reason why cultured cells that carry the altered gene proliferate in an uncontrolled manner. A number of spontaneous human cancers have been found to contain cells with genetic alterations that affect growth factor receptors, including EGFR. Most commonly, the malignant cells contain a much larger number of the receptors in their plasma membranes than do normal cells. The presence of excess receptors makes the cells sensitive to much lower concentrations of the growth factor, and thus, they are stimulated to divide under conditions that would not affect normal cells. As discussed below, growth factor receptors have become a favored target for therapeutic antibodies, which bind to the receptor’s extracellular domain, and for small molecule inhibitors, which bind to the receptor’s intracellular tyrosine kinase domain. 
Oncogenes That Encode Cytoplasmic Protein Kinases Overactive protein kinases function as oncogenes by generating signals that lead to inappropriate cell proliferation or survival. Raf, for example, is a serine-threonine protein kinase that heads the MAP kinase cascade, the primary growth-controlling signaling pathway in cells (page 640). It is evident that Raf is well positioned to wreak havoc within a cell should its enzymatic activity become altered as the result of mutation. As with the growth factor receptors and Ras, mutations that turn Raf into an enzyme that remains in the “on” position are most likely to convert the proto-oncogene into an oncogene and contribute to the cell’s loss of growth control. Raf is most closely linked to melanoma, where BRAF mutations play a causative role in the development of approximately 70 percent of these cancers. Another group of cytoplasmic kinases that are often deregulated in cancer are the cyclin-dependent kinases, especially Cdk4 and Cdk6 (Figure 14.8). Cyclin D1, a regulator of these Cdks, is also a frequent oncogene. The first oncogene to be discovered, SRC, is also a protein kinase, but one that phosphorylates tyrosine residues on protein substrates rather than serine and threonine residues. Transformation of a cell by a src-containing tumor virus is accompanied by the phosphorylation of a wide variety of proteins. Included among the apparent substrates of Src are proteins involved in signal transduction, control of the cytoskeleton, and cell adhesion. For an unknown reason, SRC mutations appear only rarely among the repertoire of genetic changes in human tumor cells.

Oncogenes That Encode Transcription Factors A number of oncogenes encode proteins that act as transcription factors. The progression of cells through the cell cycle requires the timely activation (or repression) of a large variety of genes whose products contribute in various ways to cell growth and division. It is not surprising, therefore, that alterations in the proteins that control the expression of these genes could seriously disturb a cell’s normal growth patterns. Probably the best studied oncogene whose product acts as a transcription factor is MYC. Myc regulates the expression of a huge number of proteins and noncoding RNAs (rRNAs, tRNAs, and miRNAs) involved in cell growth and proliferation. When MYC expression is selectively blocked, the progression of the cell through G1 is blocked. The MYC gene is one of the proto-oncogenes most commonly altered in human cancers, often being amplified within the genome or rearranged as the result of a chromosome translocation. These chromosomal changes are thought to remove the MYC gene from its normal regulatory influences and increase its level of expression in the cell, producing an excess of the Myc protein. One of the most common types of cancer among populations in Africa, called Burkitt’s lymphoma, results from the translocation of a MYC gene to a position adjacent to an antibody gene. The disease occurs primarily in persons who have also been infected with Epstein-Barr virus. This same virus causes only minor infections (e.g., mononucleosis) in people living in Western countries and is not associated with tumorigenesis.
Oncogenes That Encode Proteins That Affect the Epigenetic State of Chromatin As discussed in Chapter 12, two of the most important factors in determining the epigenetic state of chromatin are 1) whether particular sites in the DNA of gene promoters are methylated or not and 2) the particular modifications present in the tails of certain core histones within the nucleosomes of these same gene promoters. DNA methylation tends to silence genes whereas histone modifications may either activate or repress gene transcription. Recent studies have indicated that a number of oncogenes encode proteins that affect DNA methylation or histone modifications. These include DNA methyltransferases, histone acetylases and deacetylases, histone methyltransferases and demethylases, and proteins present within chromatin remodeling complexes. Mutations in any of these classes of genes can promote tumorigenesis by increasing or decreasing transcription of genes involved in the various signaling and regulatory pathways that affect cell proliferation, survival, migration, etc. To cite just one example, acute myeloid leukemia is characterized by recurrent mutations in DNMT3A, a gene whose product is involved in maintaining DNA methylation patterns during DNA replication. A reduction in the level of DNA methylation could lead to increased movement of transposable elements, which would cause genetic instability, as well as increased transcription of certain proto-oncogenes. Conversely, an increase in the level of DNA methylation of the promoter regions of tumor suppressor genes is known to silence the expression of genes that exert a key inhibitory influence on tumorigenesis.

Oncogenes That Encode Metabolic Enzymes It was mentioned on page 667 that tumor cells depend much more on glycolysis than do normal cells. This is only one of a number of major differences in metabolism between normal and tumor cells. Another difference, which was one of the surprising discoveries to emerge from genome sequencing studies, was the repeated presence of mutations in the TCA cycle enzyme isocitrate dehydrogenase (IDH1 and IDH2) in the tumor cells of patients with glioblastoma (brain cancer) and acute myeloid leukemia (AML). These mutations cause the enzyme to lose its normal activity of converting isocitrate to α-ketoglutarate (Figure 5.7) and instead convert the substrate to an abnormal metabolite called 2-hydroxyglutarate (2-HG), which accumulates to high levels in the tumor. The elevated levels of 2-HG have an impact on a number of processes including histone demethylation and DNA methylation. It is proposed that the disruption of these epigenetic processes would likely result in the aberrant regulation of gene expression within tumor cells.

5

Mutations in PI3K or PKB/AKT are oncogenic for reasons other than just their roles in cell survival. For example, the PI3K pathway is a major driver of the aerobic glycolytic pathway characteristic of tumor cells (page 667). It does this by activating the transcription factors HIF and MYC, which turn on the expression of genes encoding glucose transporters and glycolytic enzymes.

The Mutator Phenotype: Mutant Genes Involved in DNA Repair If one considers cancer as a disease that results from alterations in the DNA of somatic cells, then it follows that any activity that increases the frequency of genetic mutations is likely to increase the risk of developing cancer. As discussed in Chapter 13, nucleotides that become chemically altered or nucleotides that are incorporated incorrectly during replication are selectively removed from the DNA strand by DNA repair. DNA repair processes require the cooperative efforts of a substantial number of proteins, including proteins that recognize the lesion, remove a portion of the strand containing the lesion, and replace the missing segment with complementary nucleotides. If any of these proteins are defective, the affected cell can be expected to display an abnormally high mutation rate, which is described as a “mutator phenotype.” Cells with a mutator phenotype are likely to incur secondary mutations in both tumor-suppressor genes and oncogenes, which increases their risk of becoming malignant. MicroRNAs: A New Player in the Genetics of Cancer Recall from Section 11.5 that microRNAs are tiny regulatory RNAs that negatively regulate the expression of target mRNAs. Given that cancers arise as the result of abnormal gene expression, it would not be surprising to discover that miRNAs are somehow involved in tumorigenesis. In 2002, it was reported that the locus that encodes two microRNAs, miR-15a and miR-16, was either deleted or underexpressed in most cases of chronic lymphocytic leukemia. It was subsequently shown that these two miRNAs act to inhibit expression of the mRNA that encodes the antiapoptotic protein BCL-2, a known proto-oncogene. In the absence of the miRNAs, the oncogenic BCL-2 protein is overexpressed, which promotes development of leukemia. Because these miRNAs act to inhibit tumorigenesis, they can be thought of as tumor suppressors. 
When the leukemic cells lacking miR-15a and miR-16 were genetically engineered to reexpress these RNAs, they underwent apoptosis as would be expected if a missing tumor suppressor activity was restored. The locus that encodes miR-15a and miR-16 is also deleted in other types of cancers, suggesting it has widespread importance in tumor suppression. The expression of two of the most important human oncogenes, RAS and MYC, has also been shown to be inhibited

16.3 The Genetics of Cancer

Oncogenes That Encode Products That Affect Apoptosis Apoptosis is one of the body’s key mechanisms to rid itself of tumor cells at an early stage in their progression toward malignancy. Consequently, any alteration that diminishes a cell’s ability to self-destruct would be expected to increase the likelihood of that cell giving rise to a tumor. This was evident in the previous discussion of the role of the PI3K/PKB pathway in cell survival and tumorigenesis (page 678). With this in mind, it is not surprising that both PI3K and PKB are encoded by documented oncogenes.5 The oncogene most closely linked to apoptosis is BCL-2, which encodes a membrane-bound protein that inhibits apoptosis (page 659). The role of BCL-2 in apoptosis is most clearly revealed in the phenotypes of knockout mice that are lacking a Bcl-2 gene. Once formed, the lymphoid tissues of these mice undergo dramatic regression as the result of widespread apoptosis. Like MYC, the product of the BCL-2 gene becomes oncogenic when it is expressed at higher-than-normal levels, as can occur when the gene is translocated to an abnormal site on the chromosome. Certain human lymphoid cancers (called follicular B-cell lymphomas) are correlated with the translocation of the BCL-2 gene next to a gene that codes for the heavy chain of antibody molecules. It is suggested that overexpression of the BCL-2 gene leads to the suppression of apoptosis in lymphoid tissues, allowing abnormal cells to proliferate to

form lymphoid tumors. The BCL-2 gene may also play a role in reducing the effectiveness of chemotherapy by keeping tumor cells alive and proliferating despite damage by the drug treatment. Over the past few pages, we have discussed a number of the most important tumor suppressors and oncogenes involved in tumorigenesis. Figure 16.18 provides a simplified overview of some of these proteins and the signaling pathways in which they operate. Tumor suppressors and tumor-suppressive pathways are shown in red; oncogenes and tumor-promoting pathways are shown in blue. The basic functions of each of the tumor suppressors and oncogenes depicted in the figure are noted in the legend of Figure 16.18. The remarkable diversity of protein activities that can contribute to tumorigenesis is evident from this figure.

[Figure 16.18 diagram. Labeled components: growth factor receptor (e.g., EGFR or insulin receptor), PI3K, PKB (AKT), MYC, NF1, RAS, PTEN, telomerase, RAF, MDM2, pRB, p53, E2F, BRCA, BCL-2, p21, mitochondrion, DNA, break in DNA. Labeled processes and outcomes: transcription and translation of cell-cycle inhibitors (e.g., p16, p21, pRB); transcription and translation of cell-cycle promoters (e.g., CDK4 and CYCLIN D1); IMMORTALIZATION; SENESCENCE; PROLIFERATION.]

Figure 16.18 An overview of several of the signaling pathways involved in tumorigenesis that were discussed in this section. Tumor suppressors and tumor suppression are shown in red, whereas oncogenes and tumor stimulation are shown in blue. Arrows indicate activation, perpendicular lines indicate inhibition. Among the proteins depicted in this figure are transcription factors (p53, MYC, and E2F), a transcriptional coactivator or corepressor (pRB), a lipid kinase (PI3K) and lipid phosphatase (PTEN), a cytoplasmic serine-threonine kinase (RAF) and its activator (RAS), a GTPase activating protein for RAS (NF1), a protein kinase that promotes cell survival (PKB/AKT), a protein that senses DNA breaks (BRCA), subunits of a cyclin-dependent kinase (CYCLIN D1 and CDK4), a Cdk inhibitor (p21), an antiapoptotic protein (BCL-2), a ubiquitin ligase (MDM2), an enzyme that elongates DNA (telomerase), and a protein that binds growth factors (e.g., EGFR). The arrows and lines do not necessarily represent direct activation or inhibition. For example, PTEN inhibits PKB through removal of a phosphate from PIP3 and EGFR activates RAS via GRB2 and SOS. The dashed line indicates indirect action by activation of expression of the MYC gene.

by an miRNA, namely, let-7, which was one of the first miRNAs to be discovered (page 459). Some miRNAs act more like oncogenes than tumor suppressors. One specific cluster of miRNA genes, for example, is overexpressed during the formation of certain human lymphomas. Overexpression of these miRNAs can occur because the gene cluster encoding them is present in increased number (amplified) in the tumor cells, or it can occur because the gene cluster is excessively transcribed as the result of overly active transcription factors, including MYC. When mice are genetically engineered to overexpress these particular miRNAs, the animals develop lymphomas as would be predicted if the genes encoding them were acting as oncogenes. Overexpression can be oncogenic when the miRNAs target

the mRNAs of key tumor suppressor genes, such as TP53. This appears to be the case in many neuroblastomas where the TP53 gene is not mutated, but it is not expressed at normal levels. The abnormal expression of miRNAs has also been implicated as a causal factor in tumor cell invasiveness and metastasis, which elevates the interest in these RNAs to an even higher level. It is not yet clear how important miRNAs are in the overall occurrence of human cancer, but a number of microarray studies that survey large numbers of these tiny regulatory RNAs suggest that most human cancers have a characteristic miRNA expression profile, just as they have a characteristic mRNA expression profile. Studies suggest that miRNA expression profiles may serve as sensitive and accurate biomarkers


to identify the exact type of tumor a person is suffering from and the best avenue of treatment. These studies also suggest that miRNAs may serve as potential targets for anticancer therapeutics. It may be possible, for example, to treat patients with synthetic RNAs that act as “miRNA sponges.” Therapeutic RNAs of this class would possess sequences that were complementary to oncogenic miRNAs and would act to bind and sequester such miRNAs, thereby blocking their carcinogenic activity.

The Cancer Genome All cancers arise as the result of genetic alterations. As was evident from the previous discussion, the genes involved in tumorigenesis constitute a specific subset of the genome whose products are involved in such activities as the progression of a cell through the cell cycle, adhesion of a cell to its neighbors, apoptosis, and repair of DNA damage. Taken together, more than 350 different genes have been identified as “cancer genes,” that is, genes that are thought to have some causal role in the development of at least one type of malignancy. Over the past few years a concerted effort has been made to determine which of these genes are altered—either by point mutation, translocation, deletion, or duplication—in various types of tumors. This effort has been aided by recent advances in DNA sequencing that allow researchers to determine the nucleotide sequences of specific regions of the genome much more rapidly and inexpensively than ever before. It was hoped that most types of cancers would be characterized by alterations in a relatively small number of genes. It has been known for quite a while, for example, that a large percentage of melanomas exhibit mutant BRAF genes and a


large percentage of colorectal cancers exhibit mutant APC genes. Similarly, the various types of leukemias are characterized by specific translocations, as exemplified by the BCR-ABL translocation in CML (page 690). In all of these cases, these mutations occur at an early stage in tumor development and are thought to be critically important in turning the cell toward an eventual malignant state. However, results from the initial studies of cancer genomes suggest that the same types of tumors taken from different patients possess widely divergent combinations of aberrant genes. This observation presumably reflects the many different routes that individual tumors can take to escape the cell’s normal antitumor protections. These findings can be displayed as shown in Figure 16.19, where mutated genes identified in a large number of colorectal cancers are shown as peaks within a two-dimensional “mutational landscape.” The height of the various peaks reflects the frequency with which that particular gene is mutated in this particular type of cancer. It is evident from this type of display that a small number of genes are mutated in a large proportion of tumors; these can be thought of as “mountains” in the landscape. For the most part, these are the genes that cancer researchers have been focusing on over the years. Each type of cancer has its own characteristic complement of frequently mutated genes. In the case of human colorectal cancers, the three most frequently mutated genes shown in Figure 16.19, namely APC, KRAS, and TP53, tend to become mutated at distinct stages in the progression of this type of cancer (Figure 16.20). Mutations in both copies of the APC gene are found in over 60 percent of the smallest benign adenomas of the colon, suggesting that mutations in this gene often represent a first step in the formation of colon cancers. Larger adenomas as well as cell masses in the early stages of

[Figure 16.19 image: panels (a) and (b); labeled peaks: PIK3CA, FBXW7, TP53, KRAS, APC.]

Figure 16.19 The genomic landscape of colorectal cancers. These two-dimensional maps depict the genes most frequently mutated in colorectal tumors. Each reddish projection represents a different gene. The five genes that are mutated in a large proportion of tumors are represented by the tallest projections, referred to as “mountains,” and are specifically named. The 50 or so other genes that are mutated at a much lower frequency constitute the smaller “hills” of the genomic landscape. To depict the degree to which colorectal tumors from different patients share commonly mutated genes, the mutational landscapes from two individual tumors (identified as Mx38 and Mx32) are depicted in this illustration. Genes that were found to be somatically mutated in each of these individual tumors are indicated by the white circles. It is evident that very few mutations are shared between the tumors of these two individuals. In the example depicted here, only the APC and TP53 genes are mutated in both cases of the disease. (Note: The positions of the genes in this two-dimensional landscape are ordered with loci from one end of chromosome 1 at the bottom left of the landscape, proceeding through each of the autosomes in ascending order, until finally reaching the loci from chromosome X at the right edge of the landscape.) (FROM LAURA D. WOOD ET AL., COURTESY OF BERT VOGELSTEIN, SCIENCE 318:1113, 2007; © 2007. REPRINTED WITH PERMISSION FROM AAAS.)

[Figure 16.20 diagram. Stages shown: normal epithelium; early adenoma and dysplastic crypt; intermediate adenoma; late adenoma; cancer. Genetic changes labeled along the progression: APC; KRAS; loss of 18q, SMAD4, CDC4; TP53. Overall trend labeled: increasing chromosomal instability.]

Figure 16.20 A model describing the sequence of genetic mutations that often occurs during the development of colon cancer. The histological changes that occur at the various stages in the development of this cancer are indicated by the drawings. Adenomas are benign growths (polyps) that, if not removed during a colonoscopy, have the potential to develop over the years into malignant tumors. The genes indicated at each step in the development of these tumors are some of the primary drivers of colorectal tumorigenesis, as discussed in the text. About 70 percent of colon cancers exhibit chromosome instability and aneuploidy. (A. WALTHER, ET AL., NATURE.COM; NATURE REVS CANCER 9:491, 2009, FIGURE 1. NATURE REVIEWS CANCER BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

malignancy tend to contain mutations in one of the RAS oncogenes, namely KRAS. In contrast, the TP53 gene tends to mutate (or become epigenetically silenced) only at later stages along the path when the tumor is clearly malignant. Cells displaying chromosome instability, which is reflected in increasingly abnormal karyotypes, only appear after mutations in KRAS. Figures 16.19 and 16.20 illustrate how a small number of genes tend to be mutated at relatively high frequency in colorectal cancer. However, a surprisingly large number of genes are mutated at a lower frequency (less than 5 percent of cases); these have been described as “hills.” There are approximately 50 different genes representing the hills in the mutational landscape of colorectal cancers in Figure 16.19. While we can presume that the genes that are mutated at high frequency (the mountains) are important factors in driving the cells into malignancy, considerable debate is centered on the roles of the genes that are mutated at lower frequency (the hills). Many of the genes represented by the hills almost certainly have a causal role in determining properties of the malignant phenotype, even if they provide the tumor with only a small selective advantage. Mutations that cause or contribute to the malignant phenotype are described as drivers. Other genes that constitute a hill may simply represent “passengers,” that is, genes that tend to become mutated for some reason but have no effect on the phenotype of the cancer cell. It may be a daunting challenge to determine which of these genes are drivers and which are simply passengers. In addition to the genes of the mountains and hills, there is a large number of other genes that show up in a mutant state at very low frequency in a population of tumors. Mutations of this class can be seen in Figure 16.19 as the small circles scattered over each landscape that are not at the base of a hill or mountain. Mutated genes of this class are

presumed products of the overall genetic instability that characterizes cancer cells. Figure 16.19a,b shows the mutational landscapes of colorectal tumors from two different individuals. On average, an individual tumor carried approximately 80 different mutations. Close examination of the locations of the white circles in the two landscapes indicates that only a very small number of mutated genes in these cancers are shared by the two individuals. Thus, in a sense, each person suffers from his or her own distinct type of disease. Even within a given patient, metastatic lesions that have spread to different regions of the body can have remarkably different complements of mutated genes. These differences presumably reflect the different genetic changes that occur during the growth of each secondary tumor. It is evident from this and a large number of other studies on other types of cancers that the mutational landscape of the cancer genome is highly complex. However, a closer analysis of the mutations that make up the mountains and hills suggests that most of these large numbers of genes encode proteins that participate as components of a relatively small number of pathways. In one study of persons with pancreatic cancer, which is one of the most deadly and least treatable types, recurrent mutations were found in more than 60 genes, but, for the most part, these mutations affected a core set of 12 cellular pathways or processes (Table 16.2). Most importantly, half of these pathways/processes were disrupted in 100 percent of the tumor samples. In a comparable study of glioblastoma, the most common form of brain cancer, the majority of tumors exhibited mutations that affected three major pathways: p53, pRB, and PI3K. Thus, as discussed elsewhere in this chapter, cancer can be thought of not simply as a disease of aberrant genes but as one of aberrant cellular pathways. Mutations in any one of a number of genes disrupt the same


Table 16.2 Core Signaling Pathways and Processes Genetically Altered in Most Pancreatic Cancers

Regulatory process or pathway                          Number of genetically     Fraction of tumors with genetic
                                                       altered genes detected    alteration of at least one of the genes
Apoptosis                                              9                         100%
DNA damage control                                     9                         83%
Regulation of G1/S phase transition                    19                        100%
Hedgehog signaling                                     19                        100%
Homophilic cell adhesion                               30                        79%
Integrin signaling                                     24                        67%
c-Jun N-terminal kinase signaling                      9                         96%
KRAS signaling                                         5                         100%
Regulation of invasion                                 46                        92%
Small GTPase–dependent signaling (other than KRAS)     33                        79%
TGF-β signaling                                        37                        100%
Wnt/Notch signaling                                    29                        100%

From S. Jones et al., Science 321:1805, 2008; copyright 2008. Science by Moses King, Reproduced with permission of American Association for the Advancement of Science in the format reuse in a book/textbook via Copyright Clearance Center.
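The right-hand column of Table 16.2 reports, for each pathway, the fraction of tumors with an alteration in at least one gene of that pathway. The following Python sketch illustrates how such a fraction could be computed. It is only an illustration: the four tumors and their mutated genes are invented for the example, the function name pathway_alteration_fraction is hypothetical, and the pathway memberships are limited to genes named in this chapter (PIK3CA, PTEN, AKT1, and HER2 for PI3K signaling; TP53 and MDM2 for the p53 pathway).

```python
from typing import Dict, Set


def pathway_alteration_fraction(
    tumor_mutations: Dict[str, Set[str]],
    pathways: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Return, per pathway, the fraction of tumors with >= 1 mutated pathway gene."""
    n_tumors = len(tumor_mutations)
    fractions = {}
    for name, genes in pathways.items():
        # A tumor counts as "altered" if its mutated genes overlap the pathway.
        hits = sum(1 for mutated in tumor_mutations.values() if mutated & genes)
        fractions[name] = hits / n_tumors if n_tumors else 0.0
    return fractions


# Hypothetical toy data (not taken from any study): mutated genes per tumor.
tumors = {
    "tumor_A": {"PIK3CA", "TP53"},
    "tumor_B": {"PTEN"},
    "tumor_C": {"AKT1", "KRAS"},
    "tumor_D": {"TP53"},
}

# Pathway membership restricted to genes mentioned in the chapter text.
pathways = {
    "PI3K signaling": {"PIK3CA", "PTEN", "AKT1", "HER2"},
    "p53 pathway": {"TP53", "MDM2"},
}

fractions = pathway_alteration_fraction(tumors, pathways)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%}")
```

In this toy data set no single PI3K-pathway gene is mutated in more than one of the four tumors, yet the pathway as a whole is altered in three of the four, which is the sense in which cancer can be viewed as a disease of pathways rather than of individual genes.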

Gene-Expression Analysis A technology for analyzing gene expression using DNA microarrays (or DNA chips) was described in detail on page 515. Briefly, a glass slide is prepared that can contain anywhere from a handful to thousands of spots of DNA, each spot containing the DNA corresponding to a single, known gene. Any particular set of genes, such as those thought to be involved in growth and division, or those thought to be involved in the development and differentiation of lymphocytes or some other cell type, can be included within the microarray. Once prepared, the microarray is incubated with fluorescently labeled cDNAs synthesized from the mRNAs of a particular population of cells, such as those from a tumor mass that has been removed during surgery, or from the cancerous blood cells of a patient with leukemia. The fluorescently labeled cDNAs hybridize to spots of complementary DNA immobilized on the slide, and subsequent analysis of the fluorescence pattern tells the investigators which mRNAs are present within the tumor cells and their relative abundance within the mRNA population. Studies with DNA microarrays have shown that gene-expression profiles can provide invaluable information about the properties of a tumor. It has been found, for example, that (1) progression of a tumor is correlated with a change in expression of particular genes, (2) certain cancers that appear to be similar by conventional criteria can be divided into subtypes that have different clinical features on the basis of their gene-expression profiles, (3) the gene-expression profile of an individual patient’s tumor can reveal how aggressive (i.e., how lethal) the cancer is likely to be, and (4) the gene-expression profile of an individual patient’s tumor can provide clues as to which type of therapeutic strategy will be most likely to induce tumor regression. We can look at a few of these issues in more detail.
Figure 16.21 depicts the levels at which 50 different genes were found to be transcribed in two different types of leukemia, acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML). The genes used in this figure, which are named on the right side, represent those whose transcription showed the greatest difference between these two types of blood-cell cancers. Each column represents the results from a single patient with ALL or AML, so that different columns allow one to compare the similarities in expression of the gene from one patient to the next. Gene-expression levels are indicated from dark blue (lowest level) to dark red (highest level). The top half of the figure identifies genes that are transcribed at a much higher level in ALL cells, whereas the bottom half identifies genes that are transcribed to a much higher level in AML cells. These studies make it evident that there are many differences in gene expression between different types of tumors. Some of


pathway and thus have the same consequence for the cells. For example, the PI3K pathway in breast cancer can be activated by mutations in the catalytic subunit of PI3K (PIK3CA), amplification of the growth factor receptor HER2, mutations in the PI3K effector PKB (AKT1), or loss of the PI3K antagonist PTEN. A few of the most prominent pathways that tend to be deregulated in many different types of cancers were indicated in Figure 16.18. Looking at cancer as a “pathway” disease rather than a “genetic” disease generates more optimism among drug developers because it suggests that disrupting (or restoring) any one of the key steps in a single essential pathway may be sufficient to derail the malignant cells and lead to tumor regression. Before leaving the subject of cancer genetics, it should be noted that the view presented here, that cancer is a gradual multistep progression of individual point mutations, is not universally shared. Some researchers argue that the mutation rate in humans is not high enough for cells to accumulate the mutations necessary to become fully malignant during an individual’s lifetime. Instead, they have proposed that carcinogenesis is initiated by catastrophic events that lead to widespread genetic instability over a relatively small number of cell divisions. For example, mutations in a gene involved in DNA replication or DNA repair, as occurs in cases of HNPCC, might rapidly spawn cells carrying widespread genetic abnormalities. According to another proposal, cells that have undergone an abnormal cell division and possess aberrant numbers of chromosomes are likely initiators of cancerous growths. The best way to decide among these possibilities is to analyze the state of the genome in cells at very early stages in tumor formation. Unfortunately, for the interest of both researchers and cancer patients, it is virtually impossible

to identify tumors when they are composed of a small number of cells. By the time that tumors are detected, the cells already exhibit a high degree of genetic derangement, making it difficult to determine whether these genetic alterations are a cause or an effect of tumor growth.

[Figure 16.21 heat map: columns grouped as ALL (left) and AML (right); color key labeled “Normalized Expression,” running from −3 (low) to 3 (high).]

Figure 16.21 Gene-expression profiling that distinguishes two types of leukemia. Each row depicts the expression level of a single gene that is named to the right of the row. Altogether the levels of expression of 50 different genes are indicated. The color key is shown at the bottom of the figure, indicating the lowest level of expression is in dark blue and highest in dark red. Each column represents data from a different sample (patient). The columns on the left show the expression profiles from patients with acute lymphoblastic leukemia (ALL), whereas the columns on the right show the profiles from patients with acute myeloid leukemia (AML) (indicated by the brackets at the top).

It is evident that the genes in the upper box are expressed at a much greater level in patients with ALL, whereas the genes in the lower box are expressed at a much greater level in patients with AML. (The genes selected for inclusion in this figure were chosen for the illustration because of these differences in expression between the two diseases.) (FROM T. R. GOLUB ET AL., SCIENCE 286:534, 1999; COPYRIGHT 1999, SCIENCE BY MOSES KING. REPRODUCED WITH PERMISSION OF AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

these differences can be correlated with biological and clinical differences between the tumors; one, for example, is derived from a myeloid cell and the other from a lymphoid cell (see Figure 17.6). Most differences, however, cannot be explained. Why, for example, is the gene encoding catalase (the last gene on the list) expressed at a low level in ALL and a high level in AML? Even if these studies can’t answer this question, they can provide cancer researchers with a list of genes to look at more closely as potential targets for therapeutic drugs.
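The selection of genes whose expression differs most between two tumor types, as in Figure 16.21, can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the procedure of the cited study: the scoring rule used here (difference of class means divided by the sum of class standard deviations, often called a signal-to-noise score) is one simple choice among many, the gene names gene_1 to gene_3 are placeholders, and the expression values are invented.

```python
from statistics import mean, stdev


def signal_to_noise(values_a, values_b):
    """Difference of class means divided by the sum of class standard deviations."""
    spread = stdev(values_a) + stdev(values_b)
    if spread == 0:
        # No variation within either class; treated as uninformative in this sketch.
        return 0.0
    return (mean(values_a) - mean(values_b)) / spread


def rank_genes(expression, labels, class_a, class_b):
    """Rank genes from 'highest in class_a' to 'highest in class_b'."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, label in zip(values, labels) if label == class_a]
        b = [v for v, label in zip(values, labels) if label == class_b]
        scores[gene] = signal_to_noise(a, b)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical normalized expression values: one column per patient sample.
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
expression = {
    "gene_1": [2.0, 2.2, 1.8, -1.9, -2.1, -2.0],   # high in ALL, low in AML
    "gene_2": [0.1, -0.2, 0.0, 0.2, -0.1, 0.1],    # no clear difference
    "gene_3": [-1.5, -1.7, -1.6, 1.4, 1.8, 1.6],   # low in ALL, high in AML
}

ranked = rank_genes(expression, labels, "ALL", "AML")
for gene, score in ranked:
    print(f"{gene}: {score:+.2f}")
```

Genes at the top of the ranking correspond to the upper box of a display like Figure 16.21 (higher in ALL), and genes at the bottom correspond to the lower box (higher in AML); genes with scores near zero would not be selected for such a figure.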

[Figure 16.22 plots: (a) Lymph-Node–Negative Patients and (b) Lymph-Node–Positive Patients. Each panel plots overall survival (%), from 0 to 100, against years, from 0 to 12, with one curve for patients with a good signature and one for patients with a poor signature.]

Figure 16.22 The use of DNA microarray data in determining the choice of treatment. Each graph shows the survival rate over time in breast cancer patients who had either a good-prognosis signature or a poor-prognosis signature based on the level of expression of 70 selected genes. The patients in panel a showed no visible evidence that the cancer had spread to nearby lymph nodes at the time of surgery. As indicated in the plot: (1) not all of these patients survive and (2) the likelihood of survival can be predicted to a large extent by the gene-expression profiles of their tumors. This allows physicians to treat those patients with a poor signature more aggressively than those with a good signature. The patients in panel b did show visible evidence of spread of cancer cells to nearby lymph nodes. As indicated in the plot, the likelihood of survival in this group can also be predicted by gene-expression data. Normally, all of the patients in this group would be treated very aggressively, which may not be necessary for those with a good signature. (FROM M. J. VAN DE VIJVER ET AL., NEW ENGLAND JOURNAL OF MEDICINE 347:2004, 2002, COPYRIGHT 2002, THE NEW ENGLAND JOURNAL OF MEDICINE BY MASSACHUSETTS MEDICAL SOCIETY REPRODUCED WITH PERMISSION OF MASSACHUSETTS MEDICAL SOCIETY, IN THE FORMAT REUSE IN A STANDARD/CUSTOM BOOK (BASIC RIGHTS) VIA COPYRIGHT CLEARANCE CENTER.)
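The idea of assigning a tumor to a good-prognosis or poor-prognosis group from the expression of a set of signature genes can be sketched in Python as follows. This is a hedged illustration only: it assumes that a tumor is classified by correlating its expression profile with an average good-prognosis profile, uses five invented values in place of the 70 signature genes, and uses a cutoff of 0.4 that is a hypothetical parameter chosen for the example. The function names pearson and classify_signature are likewise hypothetical and do not describe any commercial test.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def classify_signature(profile, good_template, cutoff):
    """Label a tumor 'good' if it resembles the good-prognosis template."""
    return "good" if pearson(profile, good_template) > cutoff else "poor"


# Hypothetical average expression of the signature genes in good-outcome tumors.
good_template = [1.0, -0.5, 0.8, -1.2, 0.3]

# Hypothetical tumor profiles measured on the same genes, in the same order.
tumor_x = [0.9, -0.4, 0.7, -1.0, 0.2]    # closely resembles the template
tumor_y = [-0.8, 0.6, -0.9, 1.1, -0.1]   # roughly the opposite pattern

CUTOFF = 0.4  # assumed threshold, for illustration only

for name, profile in [("tumor_x", tumor_x), ("tumor_y", tumor_y)]:
    print(name, classify_signature(profile, good_template, CUTOFF))
```

A label of this kind is what separates the two survival curves in each panel of Figure 16.22; in practice the cutoff would have to be chosen and validated on patients with known outcomes.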

The earlier a cancer is discovered, the more likely a person can be cured; this is one of the cardinal principles of cancer treatment. Yet a certain percentage of tumors will prove fatal, even if discovered and removed at an early stage. It is apparent, for example, that some early-stage breast cancers already contain cells capable of seeding the formation of secondary tumors (metastases) at distant locations, whereas others do not. These differences determine the prognosis of the patient. It was found in a landmark study by a team of Dutch researchers in 2002 that the prognosis of a given breast cancer is likely to be revealed in the level of expression of approximately 70 genes out of the thousands that were studied in DNA microarrays (Figure 16.22). This finding has had important clinical applications in guiding the treatment of breast cancer patients. Those patients with early-stage tumors that display a “poor prognosis” profile on the basis of their gene-expression pattern (Figure 16.22a) can be treated aggressively with chemotherapy to maximize the chance that the formation of secondary tumors can be prevented. If gene-expression data is not considered, these individuals are not likely to receive any type of chemotherapy because there is no indication using conventional criteria that their tumor had spread. Conversely, those patients whose tumors exhibit a “good prognosis” profile might be spared the more debilitating chemotherapeutic drugs, even if their tumors appear more advanced (Figure 16.22b). In recent years, several companies have introduced laboratory tests (e.g., MammaPrint and Oncotype DX) to analyze gene-expression profiles of individual breast cancers in an attempt to help guide the best course of treatment for these patients. Even though these prognostic tests have been in widespread use for several years, their validity is still being evaluated in large clinical trials. It is hoped that gene-expression profiles can soon be used to improve the diagnosis and optimize treatment for individual patients with all types of cancer.

REVIEW
1. Contrast a benign tumor and malignant tumor; tumor-suppressor genes and oncogenes; dominantly acting and recessively acting mutations; proto-oncogenes and oncogenes.
2. What is meant by the statement that cancer arises as the result of a genetic progression?
3. Why is p53 described as the “guardian of the genome”?
4. Give three mechanisms by which p53 acts to prevent a cell from becoming malignant.
5. How can DNA microarrays be used to determine the type of cancer a patient suffers from? How might they be used to optimize cancer treatment?
6. What types of proteins are encoded by proto-oncogenes, and how do mutations in each type of proto-oncogene cause a cell to become malignant?

16.4 | New Strategies for Combating Cancer

It is painfully evident that conventional approaches to combating cancer, namely, surgery, chemotherapy, and radiation, are not usually successful in curing a patient of metastatic cancer, that is, one that has already spread from a primary tumor. Because they kill large numbers of normal cells, along with the cancer cells, chemotherapy and radiation tend to have serious side effects, in addition to having limited curative value for most advanced cancers. There has been hope for decades that these “brute-force” strategies would be replaced by targeted therapies, based on our newly formed insights into the molecular basis of malignancy. There are several ways that a therapy can be considered to be “targeted”: it can be targeted to attack only cancer cells, leaving normal cells unscathed; it can be targeted to a particular protein whose inactivation leaves the cancer cells unable to grow or survive; and/or it can be targeted to the cancer cells of a particular patient based on their unique pattern of somatic mutations. Although the cure rate for most types of cancers has not improved significantly over the past 50 years, there is reason to believe that effective targeted therapies will become available to treat many of the common cancers in the foreseeable future. This optimism is based largely on the remarkable success that has been achieved with a small number of targeted treatments that will be discussed in the following pages. Even though these successes have been scattered among a much larger number of failed prospective treatments, they demonstrate that the concept of targeted therapy is sound. In other words, these successes can be considered as “proof-of-principle.” Just as important, they give both researchers and biotechnology companies the incentive to spend time and money to continue the pursuit of better cancer treatments. The anticancer strategies to be discussed in the following sections can be divided into three groups: (1) those that depend on antibodies or immune cells to attack tumor cells, (2) those that inhibit the activity of cancer-promoting proteins, and (3) those that prevent the growth of blood vessels that nourish the tumor.


Immunotherapy We have all heard or read about persons with metastatic cancer who were told they had only months to live, yet they defy the prognosis and remain alive and cancer-free years later. The best studied cases of such “spontaneous remissions” were recorded in the late 1800s by a New York physician named William Coley. Coley’s interest in the subject began in 1891 when he came across the hospital records of a patient with an inoperable tumor of the neck, who had gone into remission after contracting a streptococcal infection beneath the skin. Coley located the patient and found him to be free of the cancer that had once threatened his life. Coley spent his remaining years trying to develop a bacterial extract that, when injected under the skin or into a tumor, would stimulate a patient’s immune system to attack and destroy the malignancy. The work was not without its successes, particularly against certain uncommon soft-tissue sarcomas. Although the use of Coley’s toxin, as it was later called, was never widely embraced, many studies have confirmed anecdotal observations that the body has the capability to destroy a tumor, even after it has become well established. In recent years, two broad treatment strategies involving the immune system have been pursued: passive immunotherapy and active immunotherapy. Passive immunotherapy is an approach that attempts to treat cancer patients by administering antibodies as therapeutic agents. These antibodies recognize and bind to specific proteins on the surface of the tumor cells being targeted. Once bound, the antibody either kills the cells directly or orchestrates an attack on the cell carried out by other elements of the immune system. As discussed in Section 18.18, the production of monoclonal antibodies capable of binding to particular target antigens was initially developed in the mid-1970s. During the next 20 years or so, attempts to use these proteins as therapeutic agents were foiled for a number of reasons. 
Most importantly, these antibodies were produced by mouse cells and encoded by mouse genes. As a result, the antibodies were recognized as foreign and cleared from the bloodstream before they had a chance to work. In subsequent efforts, investigators were able to produce “humanized antibodies,” which are antibodies that are largely human proteins except for a relatively small part that recognizes the antigen, which remains “mousey.” In the past few years, researchers have been able to produce antibodies that have a completely human amino acid sequence. Herceptin is a humanized antibody directed against a cell-surface receptor (HER2) that binds a growth factor that stimulates the proliferation of breast cancer cells. Herceptin is thought to inhibit activation of the receptor by the growth factor and stimulate receptor internalization (page 312). Approximately 25 percent of breast cancers are composed of cells that overexpress the HER2 gene, which causes these cells to be especially sensitive to growth factor stimulation. Up until the development of Herceptin, patients whose tumors overexpressed HER2 had a very poor prognosis. Their prognosis is now greatly improved. For example, one study of 3000 women with early-stage breast cancer reported that Herceptin reduced the chance of recurrence of the disease by about 50 percent in a four-year period. Recent trials suggest that survival can be increased by combining Herceptin with another monoclonal antibody (Omnitarg) that interferes with HER2 dimerization. To date, the most effective humanized antibody is Rituxan, which was approved in 1997 for treatment of non-Hodgkin’s B-cell lymphoma. Rituxan binds to a cell-surface protein (CD20) that is present on the malignant B cells in approximately 95 percent of the cases of this disease. Binding of the antibody to the CD20 protein inhibits cell growth and induces the cells to undergo apoptosis. Introduction of this antibody (together with chemotherapy) has reversed the prospects for people with this particular cancer. Over the past few years, a number of fully human antibodies have been tested in clinical trials against various types of cancers. One of these, called Vectibix, which is directed against the EGF receptor, has been approved as a single-agent treatment of EGFR-expressing metastatic colon cancer. Because it is a human protein, Vectibix remains in the circulation long enough to be administered once every other week.
Two other human monoclonal antibodies, Arzerra and Yervoy, have been approved for treatment of chronic lymphocytic leukemia and melanoma, respectively. Like Rituxan, Arzerra binds to CD20 on the surface of B cells. Yervoy acts by blocking CTLA-4, a protein that normally inhibits the body’s T cells from carrying out an immune response (Section 17.4). In addition, a number of antibodies are being developed that contain a radioactive atom or a toxic compound conjugated to the antibody molecule. The antibody targets the complex to the cancer cell, and the associated atom or compound kills the targeted cell. At the time of this writing, two radioactively labeled anti-CD20 antibodies (Zevalin and Bexxar) have been approved for the treatment of non-Hodgkin’s B-cell lymphoma, and one anti-CD30 antibody linked to a toxic compound (Adcetris) has been approved to treat Hodgkin’s lymphoma. Active (or adoptive) immunotherapy is an approach that tries to get a person’s own immune system more involved in the fight against malignant cells. The immune system has evolved to recognize and destroy foreign materials, but cancers are derived from a person’s own cells. Although many tumor cells do contain proteins that are not expressed by their normal counterparts (so-called tumor-associated antigens), or mutated proteins (e.g., BRAFV600E) that are different from


University of Pennsylvania first isolated T cells from the blood of each of the patients. As discussed in Chapter 17, T cells carry a multisubunit protein on the surface of their cells called the T cell receptor (or TCR). The TCR determines which specific molecules (antigens) a given T cell will react with and, consequently, which target cells that T cell will attack and likely kill. For the most part, T cells do not possess TCRs that would allow them to react with other cells in the body, which protects the body from self-attack. In this clinical trial, researchers isolated T cells and then genetically engineered these cells, using a disabled HIV-1 viral vector, to carry a specific, high-affinity TCR that would allow the T cells to react with and kill host cells that possess a CD19 protein on their surface. The only cells that normally carry CD19 are B cells, both the normal versions and those that are part of the CLL cancer. Once these T cells were genetically modified, they were greatly expanded in number in culture and then infused back into the patient from whom the T cells were originally obtained. Within a week or two, the patients were stricken with a sickness characterized by very high fever, lowered blood pressure, and kidney distress. These symptoms were the result of the war going on within their bodies, between the infused T cells (and the massive numbers of their progeny formed in the body after infusion) and the huge number of cancerous cells that these patients carried. It was estimated that each infused T cell was responsible for the death of more than 1000 CLL cells. In the end, the patients were left with a relatively small number of genetically engineered T memory cells derived from the infusion, a virtual absence of normal B cells due to their destruction by the T cells, and either the absence of malignant cells or, in one case, a greatly reduced tumor burden.
The presence of the memory T cells should prevent the reemergence of CLL, but it should also suppress the formation of normal B cells, which might make these patients more susceptible to infection. Keep in mind that the use of genetically modified T cells carries considerable risk, because these T cells can attack and kill any cell in the body that happens to carry the targeted antigen. As a result, these types of cell-based immunotherapies have the potential to cause severe autoimmune side effects. Whether or not the approach used in this study can be safely applied to other types of cancers is an unanswered question.

Inhibiting the Activity of Cancer-Promoting Proteins Cancer cells behave as they do because they contain proteins that are either present at abnormal concentration or display abnormal activity. A number of these proteins were illustrated in Figure 16.17. In many cases, the growth and/or survival of tumor cells is dependent on the continued activity of one or more of these deviant proteins. This dependency is known as “oncogene addiction.” If the activity of one of these proteins can be selectively blocked, it should be possible to kill the entire population of malignant cells. With this goal in mind, researchers have synthesized a virtual arsenal of small-molecular-weight compounds that inhibit the activity of cancer-promoting proteins. Some of these drugs were custom-designed to inhibit a particular protein of known structure, whereas others

16.4 New Strategies for Combating Cancer

those present in normal cells, they are still basically host proteins present in host cells. As a result, the immune system typically fails to recognize these proteins as inappropriate. Even if a person does possess immune cells that recognize certain tumor-associated antigens, tumors evolve a number of mechanisms that allow them to escape immune destruction. Many different strategies have been formulated to overcome these hurdles and artificially stimulate the immune system to mount a more vigorous response against tumor cells. One approach is to inoculate a patient with a protein known to be present in their cancer cells, such as HER2 in the case of a person suffering from breast cancer or CEA (carcinoembryonic antigen) in a person with pancreatic cancer. If the body can be stimulated to mount an immune response against the protein, that response has the potential to attack the cancer cells that have that protein on their surface. In most immunotherapeutic approaches, immune cells are isolated from the patient, modified or stimulated in one way or another in vitro, allowed to proliferate in culture, and then reintroduced into the patient. For many years, clinical trials using this strategy were disappointing, but recent publications have provided reason for cautious optimism. In many of these studies, a significant minority of patients have exhibited a positive response to the treatment, which means that their tumors have at least shrunk in size or extent and the patient’s expected time of survival has increased significantly. Why some patients respond, while most do not, remains a focus of investigation. We will briefly consider two of the most promising examples of this general type of cell-based immunotherapy. Because both of the treatments to be described have to be personalized to the individual patient, they are likely to be extremely expensive should they ever become approved and widely available.
DCVax is currently in phase II trials for the highly lethal brain tumor, glioblastoma. DCVax utilizes dendritic cells that have been taken from the patient’s blood, stimulated in vitro with antigens from the patient’s own tumor, and then injected back into the patient’s body. Because the strategy involves exposure of dendritic cells to specific proteins with the goal of generating an immune response, DCVax is described as a “cancer vaccine.” The “manufacturing process” takes about 10 days and provides enough cells for several years of treatment. Although the number of patients that have been treated to date is limited to a few dozen, the vaccine appears to have extended the median survival time of patients to more than 3 years from time of diagnosis. Without such treatment, these patients would presumably have died in about 15 months. As of July 2010, 33 percent of treated patients had survived at least 4 years and 27 percent had survived at least 6 years. Fewer than 5 percent of patients given conventional treatments survive 5 years. In August 2011, the results of a phase I trial carried out on 3 patients with advanced chronic lymphocytic leukemia (CLL) were reported both in the literature and in the news media. Phase I trials are normally carried out to determine 1) whether a drug is safe and 2) its proper dosage. In this case, the treatment resulted in the complete remission of disease in 2 of the patients and at least partial remission in the third. To carry out the treatment, Carl June and his colleagues at the


Chapter 16 Cancer

Table 16.3 Examples of Small-Molecule Targeted Therapies That Have Either Been FDA Approved or Are Being Tested

Drug                          Target                     Mechanism of action
Gleevec, Tasigna              BCR-ABL, KIT, PDGFR        Tyrosine kinase inhibitor
Sprycel                       BCR-ABL, SRC family        Tyrosine kinase inhibitor
Iressa, Tarceva               EGFR                       Tyrosine kinase inhibitor
Zactima                       VEGFR, EGFR                Tyrosine kinase inhibitor
Sutent, Votrient              VEGFR, PDGFR, KIT          Tyrosine kinase inhibitor
Tykerb                        EGFR, HER2                 Tyrosine kinase inhibitor
Nexavar                       BRAF, KIT, VEGFR           Kinase inhibitor
Zelboraf                      BRAFV600E                  Kinase inhibitor
GDC-0973                      MEK                        Inhibits MAPK cascade
Velcade, Carfilzomib          Proteasome                 Inhibits protein degradation
Zolinza, Istodax              HDACs                      Inhibits histone deacetylation (epigenetic effect?)
Erivedge                      Smoothened of Hh pathway   Blocks cell growth and survival pathway
Torisel, Afinitor             mTOR                       Blocks cell-survival pathway
BKM120, PX866                 PI3K                       Blocks cell-survival pathway
BEZ235, BGT226                PI3K, mTOR                 Blocks cell-survival pathway
Perifosine, MK2206            PKB/AKT                    Blocks cell-survival pathway
Trisenox (arsenic trioxide)   NF-κB                      Blocks cell-survival pathway
Seliciclib                    CDK2                       Induces apoptosis
Tamoxifen, Raloxifene         Estrogen receptor          Blocks estrogen action
Arimidex, Aromasin            Aromatase                  Inhibits estrogen synthesis
Zytiga                        CYP17                      Inhibits androgen synthesis
Genasense, ABT-263            BCL-2                      Induces apoptosis
17-AAG                        HSP90                      Inhibits molecular chaperone
Nutlins                       p53                        Inhibits p53-MDM2 interaction
PRIMA-1                       p53                        Restores mutant p53 activity
PX-478                        HIF-1                      Inhibits this transcription factor that is activated by hypoxia
Veliparib, Rucaparib          PARP-1                     Blocks HR-based DNA repair
Dacogen, Vidaza               DNMT                       Inhibits DNA methylation

were identified by randomly screening large numbers of compounds that have been synthesized by pharmaceutical companies. Once a protein-inhibiting compound has been identified, it is typically tested for effectiveness against a panel of approximately 60 different types of cultured cells, each originally isolated from a different human cancer. Using a diverse panel of cells allows researchers to reconstitute the genetic heterogeneity and diverse drug sensitivities found among human cancers. Success against cultured cells generally leads to tests of the agent in immunocompromised mice carrying human tumor transplants (xenografts). A list of some of the agents that have been tested in clinical trials is shown in Table 16.3. Although numerous compounds on this list have shown some promise in halting the growth of various types of tumors, one compound has had unparalleled success in treating patients with chronic myelogenous leukemia (CML). It was noted earlier that certain types of cancers are initiated by specific chromosome translocations. CML is caused by a translocation that brings a proto-oncogene (ABL) in contact with another gene (BCR) to form a chimeric gene (BCR-ABL). Blood-forming cells that carry this translocation express a high level of ABL tyrosine kinase activity, which causes the cells to proliferate uncontrollably and initiate the process of tumorigenesis. As discussed on page 75, a compound called Gleevec was identified that selectively inhibits

ABL kinase by binding to the inactive form of the protein and preventing its phosphorylation by another kinase, which is required for ABL activation. Initial clinical trials on Gleevec were remarkably successful, causing nearly all CML patients who received the drug at a sufficiently high dose to go into remission. These studies confirmed the notion that elimination of a single required oncogene product could stop the growth of a human cancer. The drug was rapidly approved and has been in use for years. Patients have to continue to take the drug to remain in remission, and many of them, especially those who began treatment at an advanced stage, eventually develop drug resistance. Most cases of resistance result from mutations in the ABL portion of the fusion gene. This has prompted the development of a second generation of targeted inhibitors that remain active against most of the mutated forms of the ABL kinase (see Figure 2.52d). These drugs appear to be effective in treating Gleevec-resistant cases of CML and suggest that the ideal drug regimen may consist of a cocktail of several different inhibitors that target different parts of the same protein, ensuring that drug-resistant mutants will not emerge. It was hoped that Gleevec would be followed rapidly by numerous other highly effective protein-inhibiting drugs. Although a number of protein-targeting, small-molecule inhibitors have shown modest success in clinical trials and have


been approved by the FDA, and hundreds of others are being tested in the clinic, none of them studied to date have been able to fully stop the growth of any of the common solid cancers, namely those of the breast, lung, prostate, or pancreas. One major success has been achieved (Figure 16.23). A drug called Zelboraf (PLX4032) has recently been approved for the treatment of patients with metastatic melanoma whose cancer is driven by a specific mutated version of BRAF, one in which the normal valine residue at position 600 is replaced by a glutamic acid (BRAFV600E), thereby changing the shape of the active site. This mutation occurs in approximately 50 percent of melanoma patients and Zelboraf specifically blocks the activity of this mutant enzyme without affecting the normal protein. As with Gleevec, patients develop resistance to Zelboraf but, unlike Gleevec, resistance appears much earlier (typically after 7 months of treatment) and, in most cases, results from mutations in other genes rather than secondary mutations in BRAF. Resistance to Zelboraf typically occurs because the MAPK pathway once again becomes constitutively active, bypassing the cells’ need for BRAF kinase activity, which continues to be inhibited by the drug. Studies are underway to combine Zelboraf with inhibitors of downstream components of the pathway, such as MEK or ERK. It is unlikely that a tumor will harbor cells that are resistant to two drugs that target two different pathways or two different proteins in the same pathway. Consequently, an approach that uses a combination of drugs may be the best way to block the

Figure 16.23 Therapeutic effects of the drug Zelboraf (PLX4032) on patients with metastatic melanoma harboring a mutant BRAFV600E oncoprotein. These PET scans of a patient prior to (a) and 2 weeks after initiation of treatment (b) show the dramatic beneficial effects of this drug. The dark regions of the scan show the locations of the metabolically active cancerous lesions. Tumor regrowth occurs in many of the patients due to the emergence of drug-resistant clones. (FROM GIDEON BOLLAG, ET AL., NATURE 467:599, 2010; © 2010, REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

emergence of resistant cells. This use of combined drug therapy has already proven highly effective in blocking the appearance of resistant strains of HIV in infected patients and, like cancer cells, HIV is also highly mutable. Unfortunately, inhibiting some of these key signaling proteins in cancer cells can have serious side effects as a result of their actions in normal cells of the body. The reasons for the failure to develop drugs that can eradicate solid, epithelial-based tumors (carcinomas) are not entirely clear. One reason may be that these tumors are more complex genetically and the cells are not as dependent on a single oncogene product and aberrant signaling pathway as are the blood-cell cancers and melanoma. Another reason may be that only a fraction of the patients with a particular type of tumor are sensitive to a particular drug. This was originally suggested in studies of Iressa, an inhibitor of the tyrosine kinase of the EGF receptor (EGFR). Iressa was originally tested on lung cancer patients because these tumors were known to exhibit high levels of EGFR. Initial clinical trials found that approximately 10 percent of patients in the United States and 30 percent of Japanese patients responded positively to the drug, whereas the remaining population was unaffected. Subsequent studies on cancers that contained EGFR mutations indicated that targeting the mutant EGFR was only potentially successful if the patient had wild-type KRAS. Those patients whose cancers also exhibited a mutant KRAS gene were nonresponders. As a result, lung (and colon) cancers are now tested for KRAS mutations before anti-EGFR therapy is initiated. More recently, it was discovered that the drug Xalkori is highly effective in treating a select subgroup of lung cancer patients who had normal EGFR genes but expressed an EML4-ALK fusion protein that results from a chromosomal translocation. Xalkori inhibits ALK kinase.
The drug was approved along with a “companion diagnostic” test that uses fluorescence in situ hybridization (FISH) to identify the presence of the rearrangement, which occurs in about 5 percent of patients. These findings verify the notion that not all cases of a particular cancer can be treated alike. Instead, targeted cancer therapies will ultimately have to be tailored to fit the specific genetic modifications present in the tumors of individual patients. With this in mind, a number of major cancer centers have begun programs to sequence a panel of selected genes in patients with certain cancers. This information is also being used to help select individuals for participation in clinical trials, thereby increasing the likelihood of finding drugs that are effective in treating subsets of patients with a certain type of cancer. Without this type of genetic screening, the positive signals for a drug that might be useful for a subset of patients can be drowned out by the lack of clinical benefit to the population as a whole. The current challenge is to identify genetic markers that have value in predicting the success or failure of a given therapy. We have focused in this section upon targeted therapies that inhibit proteins that are either abnormal themselves or are abnormally expressed in cancer cells. But there may also be proteins that have a normal structure and expression, yet for some reason play an important role in the lives of cancer cells. Inhibitors that target such proteins might have considerable


potential as therapeutic agents in the treatment of cancer. We have seen in this chapter how cancer cells contain mutations in a wide variety of genes, many of which lead to the overall inhibition of certain metabolic or signaling pathways. While this may help promote the growth and survival of these cancer cells, it also may cause them to become more dependent than normal cells upon other pathways that continue to operate in a normal fashion. Recent announcements hailing the promise of PARP-1 inhibitors in the treatment of several types of cancer provide an example of this type of reasoning. PARP (which is an acronym for poly(ADP-ribose) polymerase) is a poorly understood enzyme that is involved in numerous processes involving DNA metabolism, including DNA repair. PARP-1 inhibitors have shown particular promise in the treatment of breast and ovarian cancers that exhibit deficiencies in BRCA1 or BRCA2. As discussed on page 678, BRCA proteins are involved in DNA repair and it is reasoned that such cancer cells are more dependent than normal cells upon other DNA repair pathways, including those that require PARP-1. When PARP-1 is inhibited in BRCA-deficient tumor cells, certain types of DNA damage cannot be repaired, causing the cells to die by apoptosis. This type of treatment is based on a strategy of “synthetic lethality,” which suggests that mutations or inhibition of only one protein (e.g., BRCA or PARP) has no effect on cell viability but that mutations and/or inhibition of two different proteins leaves the cell unable to carry out one or more essential functions. More importantly, this approach provides a general strategy to target cancer cells that have lost the function of a particular tumor suppressor protein, in this case BRCA1 or BRCA2. 
A similar strategy might also be devised to target cancer cells that have lost p53 function, for example, because such cells should also be vulnerable to drugs that have no effect on normal cells, which possess intact tumor-suppressive pathways. It might be possible to find genes that might be sensitive to synthetic lethality by searching the data from cancer genomics studies, looking for either 1) two genes that are never mutated together in the same tumor or 2) genes that are never mutated in any cases of a particular cancer. Genes in this latter category are presumably required for survival of the cancer cells and represent possible drug targets. The Concept of a Cancer Stem Cell Another reason for the failure to develop more effective targeted therapies may be that the agents are not targeting the appropriate cells within the tumor. This possibility requires further explanation but raises an important issue concerning both the basic biology of cancer and its treatment. Throughout this chapter, we have considered a tumor to be a mass of relatively homogeneous cells. When viewed in this way, all of the cells in a tumor are capable of unlimited proliferation and all of the cells have the opportunity to evolve into a more malignant phenotype as the result of ongoing genetic change. In recent years, a new concept has emerged, which suggests that while most of the cells of a tumor may be dividing at a rapid rate, they have relatively limited long-term potential to sustain the primary tumor or initiate a new secondary tumor. Instead, a relatively small fraction of cells scattered throughout the tumor are responsible

for maintaining the tumor and promoting its spread. These “special” cells are known as cancer stem cells⁶, and there is considerable experimental evidence for their existence in leukemias, brain tumors, breast tumors, and other cancers. In most cases, the evidence for the existence of cancer stem cells rests on the finding that only a subpopulation of cells in a given human tumor is able to initiate the formation of a new tumor following injection into a susceptible (immunodeficient) mouse. Those cells that are able to regrow the tumor in the inoculated animal often possess certain surface proteins that distinguish them from the cells that make up the bulk of the tumor, and it is these markers that serve to define the cancer stem cell. The term cancer stem cell does not suggest anything about the origin of these cells, whether from a tissue stem cell or some other type. Nor does it imply that cancer stem cells necessarily resemble normal tissue stem cells. Instead, they are defined simply as the cells capable of propagating or regrowing the tumor. The concept of the cancer stem cell is raised at this point in the chapter because it has important consequences for cancer therapy. If it is true that only a small subpopulation of cells in a tumor has the capability of continuing the life of that tumor, then developing drugs that rapidly kill off the bulk of a tumor mass but spare the cancer stem cells is doomed ultimately to fail. This is presumably the reason that patients with CML must continue to take Gleevec to keep their disease in remission; Gleevec does not seem to kill the leukemia-initiating cells responsible for the recurrence of disease. While there are efforts underway to identify cancer stem cells in various types of tumors and learn more about their properties, these new views are just beginning to have an impact on drug development.
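
The genomics search for synthetic-lethal candidates suggested earlier in this section (looking for pairs of genes that are never mutated together in the same tumor, and for genes that are never mutated in any case of a particular cancer) can be illustrated with a short computational sketch. The Python example below is only a toy illustration under stated assumptions: the tumor identifiers, the gene names (A through E), the function names, and the min_count threshold are all hypothetical and are not drawn from any real data set. With so few tumors, the absence of co-mutation could easily arise by chance, so an actual analysis would presumably require a large number of sequenced tumors and a statistical test.

```python
from itertools import combinations

# Hypothetical toy data: tumor identifier -> set of genes mutated in that tumor.
tumors = {
    "T1": {"A", "C"},
    "T2": {"A", "D"},
    "T3": {"B", "C"},
    "T4": {"B", "D"},
}
# Hypothetical panel of genes that were sequenced in every tumor.
panel = ["A", "B", "C", "D", "E"]


def never_comutated_pairs(tumors, min_count=2):
    """Return gene pairs in which each gene is mutated in at least
    min_count tumors, but the two genes are never mutated together."""
    counts = {}
    for genes in tumors.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    frequent = sorted(g for g, n in counts.items() if n >= min_count)
    pairs = []
    for a, b in combinations(frequent, 2):
        together = any(a in genes and b in genes for genes in tumors.values())
        if not together:
            pairs.append((a, b))
    return pairs


def never_mutated(tumors, panel):
    """Return the panel genes that are not mutated in any tumor."""
    mutated = set()
    for genes in tumors.values():
        mutated |= genes
    return sorted(set(panel) - mutated)


print(never_comutated_pairs(tumors))  # [('A', 'B'), ('C', 'D')]
print(never_mutated(tumors, panel))   # ['E']
```

In this toy data set, genes A and B (and likewise C and D) are each mutated in two tumors but never in the same tumor, which would make each pair a candidate for a synthetic-lethal relationship. Gene E is never mutated in any tumor, which would make it a candidate for a gene required for survival of the cancer cells.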

Inhibiting the Formation of New Blood Vessels (Angiogenesis) As a tumor grows in size, it stimulates the formation of new blood vessels, a process termed angiogenesis (Figure 16.24). Blood vessels are required to deliver nutrients and oxygen to the rapidly growing tumor cells and to remove their waste products. Blood vessels also provide the conduits for cancer cells to spread to other sites in the body. In 1971, Judah Folkman of Harvard University suggested that solid tumors might be destroyed by inhibiting their ability to form new blood vessels. After a quarter of a century of relative obscurity, this idea has been translated into an approved anticancer strategy.

⁶ It should be kept in mind that a number of experiments argue against the existence of rare cancer stem cells, and the issue is the focus of current debate. Cancer stem cells are essentially defined by a single assay: injection of cells into immunodeficient mice. The primary question in this debate is how large a fraction of cells in a tumor are capable of regrowing the tumor in this assay. A number of studies suggest that the answer to this question depends on the degree to which the animal that receives the tumor cells is immunocompromised rather than the nature of the cells that are injected. There is also considerable debate over the degree to which cancer stem cells can be defined by novel cell-surface markers. See Science 324:1670, 2009; Cell 138:822, 2009; and Nature Med. 17:313, 2011 for a discussion of these issues.


Cancer cells promote angiogenesis by secreting growth factors, such as VEGF, that act on the endothelial cells of surrounding blood vessels, stimulating them to proliferate and develop into new vessels. A variety of angiogenic inhibitors have been developed by biotechnology companies, including antibodies and synthetic compounds directed against integrins, growth factors, and growth-factor receptors. Preclinical studies on mice and rats suggested that angiogenic inhibitors might be effective in stopping tumor growth. Most importantly, tumors treated with these inhibitors in preclinical studies did not become resistant to repeated drug application. Tumor cells become resistant to the usual chemotherapeutic agents because the cells are genetically unstable and can evolve into resistant forms. Inhibitors of angiogenesis, however, target normal, genetically stable endothelial cells, which continue to respond to the presence of these agents. Inhibiting angiogenesis in human tumors is not as easily accomplished as might have been expected based on studies with mice. To date, the most promising results have been obtained with a humanized antibody (called Avastin) that is directed against VEGF, the endothelial cell growth factor that is overexpressed in most solid tumors. Avastin blocks VEGF from binding to and activating its receptor, VEGFR. FDA approval was based on clinical trials demonstrating that Avastin, in combination with standard chemotherapy, could prolong the lives of patients with metastatic colorectal cancer for several months. Subsequent studies suggested that Avastin could add several months to the lives of patients with several other types of solid tumors, but this conclusion has been debated, most notably in cases of breast cancer. Whether or not

inhibition of angiogenesis will ever prove itself as an effective therapy remains to be seen but, at the present time, interest in this therapeutic strategy has waned. For the foreseeable future, the best anticancer strategy is early detection. The time that elapses between the initiation of one of the common epithelial cancers and its progression to a deadly metastatic disease is generally considered to be at least a decade and often much longer. Consequently, there is ample time to detect these cancers before they become life-threatening. A number of screening procedures are currently in use, including mammography for detecting breast cancer, Pap smears for detecting cervical cancer, PSA determinations for detecting prostate cancer, and colonoscopy for detecting colorectal cancer. It is hoped that advances in proteomics will eventually lead to the development of new screening tests based on the relative levels of various proteins present in blood or other bodily fluids. This approach was discussed in some detail on page 72. There are other blood-borne biological indicators (biomarkers) that could reveal the presence of cancer, including mutant DNA, particular miRNAs, abnormal carbohydrates, and distinctive metabolites. Screening tests based on each of these types of biomarkers are being investigated.⁷

⁷ There is a widely held view that increased early screening procedures are likely to identify an increased number of cancers that will never become life-threatening and, thus, will increase the cost and debilitating side effects of treatments that do not affect survival rates (discussed in Ann Rev. Med. 60:125, 2009). The most compelling evidence for the value of such procedures comes from a recent study that found that removal of polyps during a colonoscopy greatly reduces the likelihood of death from colon cancer (NEJM 366:687, 2012).

Figure 16.24 Angiogenesis and tumor growth. Steps in the vascularization of a primary tumor. In step 1, the tumor proliferates to form a small mass of cells. As long as it is avascular (without blood vessels), the tumor remains very small (1–2 mm). In step 2, the tumor mass has produced angiogenic factors that stimulate the endothelial cells of nearby vessels to grow out toward the tumor cells. In step 3, the tumor has become vascularized and is now capable of unlimited growth. [Figure labels: primary tumor, basement membrane, blood vessel; panels numbered 1–3.] (AFTER B. R. ZETTER, WITH PERMISSION FROM THE ANNUAL REVIEW OF MEDICINE, VOL 49; COPYRIGHT 1998. ANNUAL REVIEW OF MEDICINE BY ANNUAL REVIEWS. REPRODUCED WITH PERMISSION OF ANNUAL REVIEWS IN THE FORMAT REPUBLISH IN A BOOK VIA COPYRIGHT CLEARANCE CENTER.)

16.4 New Strategies for Combating Cancer


EXPERIMENTAL PATHWAYS: The Discovery of Oncogenes

In 1911, Peyton Rous of The Rockefeller Institute for Medical Research published a paper that was less than one page in length (it shared the page with a note on the treatment of syphilis) and had virtually no impact on the scientific community. Yet this paper reported one of the most farsighted observations in the field of cell and molecular biology.1 Rous had been working with a chicken sarcoma that could be propagated from one hen to another by inoculating a host of the same strain with pieces of tumor tissue. In this paper, Rous described a series of experiments that strongly suggested that the tumor could be transmitted from one animal to another by a “filterable virus,” which is a term that had been coined a decade or so earlier to describe pathogenic agents that were small enough to pass through filters that were impermeable to bacteria. In his experiments, Rous removed the tumors from the breasts of hens, ground the cells in a mortar with sterile sand, centrifuged the particulate material into a pellet, removed the supernatant, and forced the supernatant fluid through filters of varying porosity including those small enough to prevent the passage of bacteria. He then injected the filtrate into the breast muscle of a recipient hen and found that a significant percentage of the injected animals developed the tumor.

The virus discovered by Rous in 1911 is an RNA-containing virus. By the end of the 1960s, similar viruses were found to be associated with mammary tumors and leukemias in rodents and cats. Certain strains of mice had been bred that developed specific tumors with very high frequency. RNA-containing viral particles could be seen within the tumor cells and also budding from the cell surface, as shown in the micrograph of Figure 1. It was apparent that the gene(s) causing tumors in these inbred strains are transmitted vertically, that is, through the fertilized egg from mother to offspring, so that the adults of each generation invariably develop the tumor. These studies provided evidence that the viral genome can be inherited through the gametes and subsequently transmitted from cell to cell by means of mitosis without having any obvious effect on the behavior of the cells. The presence of inherited viral genomes is not a peculiarity of inbred laboratory strains, because it was shown that wild (feral) mice treated with chemical carcinogens develop tumors that often contain the same antigens characteristic of RNA tumor viruses and that exhibit virus particles under the electron microscope.

One of the major questions concerning the vertical transmission of RNA tumor viruses was whether the viral genome is passed from parents to progeny as free RNA molecules or is somehow integrated into the DNA of the host cell. Evidence indicated that infection and transformation by these viruses required the synthesis of DNA. Howard Temin of the University of Wisconsin suggested that the replication of RNA tumor viruses occurs by means of a DNA intermediate (a provirus), which then serves as a template for the synthesis of viral RNA. But this model requires a unique enzyme, an RNA-dependent DNA polymerase, which had never been found in any type of cell. Then in 1970, an enzyme having this activity was discovered independently by David Baltimore of the Massachusetts Institute of Technology and by Temin and Satoshi Mizutani.2,3 Baltimore examined the virions (the mature viral particles) from two RNA tumor viruses, Rauscher mouse leukemia virus (R-MLV) and Rous sarcoma virus (RSV).
A preparation of purified virus was incubated under conditions that would promote the activity of a DNA polymerase, including Mg2+ (or Mn2+), NaCl, dithiothreitol (which prevents the -SH groups of the enzyme from becoming oxidized), and all four deoxyribonucleoside triphosphates, one of which (TTP) was radioactively labeled. Under these conditions, the preparation incorporated the labeled DNA precursor into an acid-insoluble product that exhibited the properties of DNA (Table 1). As is characteristic of DNA, the reaction product was rendered acid soluble (indicating that it had been converted to low-molecular-weight products) by treatment with pancreatic deoxyribonuclease or micrococcal nuclease, but it was unaffected by pancreatic ribonuclease or by alkaline hydrolysis (to which RNA is sensitive; Table 1). The DNA-polymerizing enzyme was found to co-sediment with the

Figure 1 Electron micrograph of a Friend mouse leukemia virus budding from the surface of a cultured leukemic cell. (COURTESY OF E. DE HARVEN.)

Table 1 Characterization of the Polymerase Product

Expt.  Treatment                    Acid-insoluble radioactivity  Percentage undigested product
1      Untreated                    1,425                         (100)
1      20 µg deoxyribonuclease      125                           9
1      20 µg micrococcal nuclease   69                            5
1      20 µg ribonuclease           1,361                         96
2      Untreated                    1,644                         (100)
2      NaOH hydrolyzed              1,684                         100

Source: From D. Baltimore, Nature 226:1210, 1970, copyright 1970. Nature by Nature Publishing Group. Reproduced with permission of Nature Publishing Group in the format reuse in a book/textbook via Copyright Clearance Center.
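The percentages in the last column of Table 1 can be recomputed from the radioactivity values. The short Python sketch below is only an illustration: the function and variable names are hypothetical, and the cap at 100 percent is an assumption made here to match the table, because the NaOH-hydrolyzed sample (1,684) slightly exceeds its untreated control (1,644).

```python
# Acid-insoluble radioactivity values transcribed from Table 1.
EXPERIMENTS = {
    1: {
        "control": 1425,
        "treated": {
            "deoxyribonuclease": 125,
            "micrococcal nuclease": 69,
            "ribonuclease": 1361,
        },
    },
    2: {
        "control": 1644,
        "treated": {"NaOH hydrolyzed": 1684},
    },
}


def percent_undigested(treated, control):
    """Treated counts as a whole-number percentage of the untreated control.

    The cap at 100 is an assumption made for this sketch so that a sample
    that exceeds its control is reported as fully undigested.
    """
    return min(100, round(100 * treated / control))


def summarize(experiments):
    """Map each treatment name to its percentage of undigested product."""
    return {
        name: percent_undigested(counts, expt["control"])
        for expt in experiments.values()
        for name, counts in expt["treated"].items()
    }


if __name__ == "__main__":
    for name, pct in summarize(EXPERIMENTS).items():
        print(f"{name}: {pct}% undigested")
```

The two DNA-specific nucleases leave less than a tenth of the product acid insoluble, whereas ribonuclease and alkali leave it essentially intact, which is the pattern expected of DNA.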

Figure 2 Incorporation of radioactivity from [3H]TTP into an acid-insoluble precipitate by the Rauscher murine leukemia virus DNA polymerase in the presence and absence of ribonuclease. (Note: the labeled TTP precursor is converted to TMP as it is incorporated into DNA.) Curve 1, no added ribonuclease; curve 2, preincubated without added ribonuclease for 20 minutes before addition of [3H]TTP; curve 3, ribonuclease added to the reaction mixture; curve 4, preincubated with ribonuclease before addition of [3H]TTP. [Plot axes: x, time in minutes (30 to 120); y, 3H-TMP incorporated (c.p.m. × 10^-2; 5 to 25).] (FROM D. BALTIMORE, NATURE 226:1210, 1970, COPYRIGHT 1970. NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


mature virus particles, suggesting that it was part of the virion itself and not an enzyme donated by the host cell. Although the product was insensitive to treatment with pancreatic ribonuclease, the template was very sensitive to this enzyme (Figure 2), particularly if the virions were pretreated with the ribonuclease prior to addition of the other components of the reaction mixture (Figure 2, curve 4). These results strengthened the suggestion that the viral RNA was providing the template for synthesis of a DNA copy, which presumably served as a template for the synthesis of viral mRNAs required for infection and transformation. Not only did these experiments suggest that cellular transformation by RNA tumor viruses proceeded through a DNA intermediate, they also overturned the long-standing concept originally proposed by Francis Crick and known as the Central Dogma, which stated that information in a cell always flowed from DNA to RNA to protein. The RNA-dependent DNA polymerase became known as reverse transcriptase. During the 1970s, attention turned to the identification of the genes carried by tumor viruses that were responsible for transformation and the mechanism of action of the gene products. Evidence from genetic analyses indicated that mutant strains of viruses could be isolated that retained the ability to grow in host cells, but were unable to transform the cell into one exhibiting malignant properties.4 Thus, the capacity to transform a cell resided in a restricted portion of the viral genome. These findings set the stage for a series of papers by Harold Varmus, J. Michael Bishop, Dominique Stehelin, and their co-workers

at the University of California, San Francisco. These researchers began by isolating mutant strains of the avian sarcoma virus (ASV) carrying deletions of 10 to 20 percent of the genome that render the virus unable to induce sarcomas in chickens or to transform fibroblasts in culture. The gene responsible for transformation, which is missing in these mutants, was referred to as src (for sarcoma). To isolate the DNA corresponding to the deleted regions of these mutants, which presumably carry the genes required for transformation, the following experimental strategy was adopted.5 RNA from the genomes of complete (oncogenic) virions was used as a template for the formation of a radioactively labeled, single-stranded, complementary DNA (cDNA) using reverse transcriptase. The labeled cDNA (which is present as fragments) was then hybridized to RNA obtained from one of the deletion mutants. Those DNA fragments that failed to hybridize to the RNA represent portions of the genome that had been deleted from the transformation-defective mutant and thus were presumed to contain the gene required by the virus to cause transformation. DNA fragments that did not hybridize to the RNA were separated from those that were part of DNA–RNA hybrids by column chromatography. Using this basic strategy, a DNA sequence referred to as cDNAsarc was isolated, which corresponded to approximately 16 percent of the viral genome (1600 nucleotides out of a total genomic length of 10,000 nucleotides). Once isolated, cDNAsarc proved to be a very useful probe. 
It was first shown that this labeled cDNA hybridizes to DNA extracted from cells of a variety of avian species (chicken, turkey, quail, duck, and emu), indicating that the cellular genomes of these birds contain a DNA sequence that is closely related to src.6 These findings provided the first strong evidence that a gene carried by a tumor virus that causes cell transformation is actually present in the DNA of normal (uninfected) cells and thus is presumably a part of the cells’ normal genome. These results indicated that the transforming genes of the viral genome (the oncogenes) are not true viral genes, but rather are cellular genes that were picked up by RNA tumor viruses during a previous infection. Possession of this cell-derived gene apparently endows the virus with the power to transform the very same cells in which this gene is normally found. The fact that the src sequence is present in all of the avian species tested suggests that the sequence has been conserved during avian evolution and, thus, is presumed to govern a basic activity of normal cells. In a subsequent study, it was found that cDNAsarc binds to DNA from all vertebrate classes, including mammals, but not to the DNA from sea urchins, fruit flies, or bacteria. Based on these results it was possible to conclude that the src gene is not only present in the RNA of the ASV genome and the genome of the chicken cells it can infect, but a homologous gene is also present in the DNA of distantly related vertebrates, suggesting that it plays some critical function in the cells of all vertebrates.7 These findings raised numerous questions; foremost among these were (1) what is the function of the src gene product, and (2) how does the presence of the viral src gene (referred to as v-src) alter the behavior of a normal cell that already possesses a copy of the cellular gene (referred to as c-src)? 
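The subtraction logic behind the isolation of cDNAsarc can be mimicked with a toy Python model. Everything in it is an assumption made for illustration: the genome is a random string, the position of the deletion and the fragment length are invented, and "hybridization" is reduced to exact substring matching, which ignores base-pair complementarity and partial matches. Only the genome length (about 10,000 nucleotides) and the size of the deleted region (about 1,600 nucleotides) come from the text.

```python
import random

GENOME_LENGTH = 10_000      # approximate ASV genome length given in the text
DELETION = (7_000, 8_600)   # hypothetical location of a 1,600-nucleotide deletion
FRAGMENT_SIZE = 100         # hypothetical length of each labeled cDNA fragment


def random_sequence(length, seed=1):
    """Build a reproducible random RNA-like sequence (a stand-in for a viral genome)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGU") for _ in range(length))


def fragment(sequence, size):
    """Cut a sequence into consecutive, non-overlapping fragments."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def unhybridized(probe_fragments, target):
    """Keep the fragments that find no match in the target (the 'flow-through')."""
    return [f for f in probe_fragments if f not in target]


# Complete (oncogenic) genome and a transformation-defective deletion mutant.
wild_type = random_sequence(GENOME_LENGTH)
mutant = wild_type[:DELETION[0]] + wild_type[DELETION[1]:]

# Fragments copied from the complete genome that fail to match the mutant
# correspond to the deleted region.
src_specific = unhybridized(fragment(wild_type, FRAGMENT_SIZE), mutant)
recovered = sum(len(f) for f in src_specific)

print(f"{len(src_specific)} fragments, {recovered} nucleotides, "
      f"{recovered / GENOME_LENGTH:.0%} of the genome")
```

In this model the unmatched fragments add up to the 1,600 deleted nucleotides, or 16 percent of the genome, the same proportion reported for cDNAsarc.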
The product of the src gene was initially identified by Ray Erikson and co-workers at the University of Colorado by two independent procedures: (1) precipitation of the protein from extracts of transformed cells by antibodies prepared from RSV-infected animals, and (2) synthesis of the protein in a cell-free protein-synthesizing system using the isolated viral gene as a template. Using these procedures, the src gene product was found to be a protein of 60,000 daltons, which they named pp60src.8 When pp60src was incubated with [32P]ATP, radioactive phosphate groups were transferred to the heavy chains of the associated antibody (IgG) molecules used in the immunoprecipitation. This finding suggested that the src gene codes for an enzyme that possesses protein kinase activity.9 When cells infected with ASV were fixed, sectioned, and incubated with ferritin-labeled antibodies against pp60src, the antibodies were found to be localized on the inner surface of the plasma membrane, suggesting a concentration of the src gene product in this part of the cell (Figure 3).10 These were the first studies to elucidate the function of an oncogene.

Figure 3 Electron micrograph of a section through a pair of adjacent fibroblasts that had been treated with ferritin-labeled antibodies against the pp60src protein. The protein is localized (as revealed by the dense ferritin granules) at the plasma membrane of the cell and is particularly concentrated at the sites of gap junctions. (FROM MARK C. WILLINGHAM, GILBERT JAY, AND IRA PASTAN, CELL 18:128, 1979, WITH PERMISSION FROM ELSEVIER.)

A protein kinase is the type of gene product that might be expected to have potential transforming activity, because it can regulate the activities of numerous other proteins, each of which might serve a critical function in one or another activity related to cell growth. Further analysis of the role of the src gene product turned up an unexpected finding. Unlike all the other protein kinases whose function had been studied, pp60src transferred phosphate groups to tyrosine residues on the substrate protein rather than to serine or threonine residues.11 The existence of phosphorylated tyrosine residues had escaped previous detection because phosphorylated serine and threonine residues are approximately 3000 times more abundant in cells than phosphotyrosine, and because phosphothreonine and phosphotyrosine residues are difficult to separate from one another by traditional electrophoretic procedures. Not only did the viral src gene (v-src) code for a tyrosine protein kinase, so too did c-src, the cellular version of the gene. However, the number of phosphorylated tyrosine residues in proteins of RSV-transformed cells was approximately eight times higher than that of control cells. This finding suggested that the viral version of the gene may induce transformation because it functions at a higher level of activity than the cellular version.

The results from the study of RSV provided preliminary evidence that an increased activity of an oncogene product could be a key to converting a normal cell into a malignant cell. Evidence soon became available that the malignant phenotype could also be induced by an oncogene that contained an altered nucleotide sequence. A key initial study was conducted by Robert Weinberg and his colleagues at the Massachusetts Institute of Technology using the technique of DNA transfection.12 Weinberg began the studies by obtaining 15 different malignant cell lines that were derived from mouse cells that had been treated with a carcinogenic chemical. Thus these cells had been made malignant without exposing them to viruses.
The DNA from each of these cell lines was extracted and used to transfect a type of nonmalignant mouse fibroblast called an NIH3T3 cell. NIH3T3 cells were selected for these experiments because they take up exogenous DNA with high efficiency and they are readily transformed into malignant cells in culture. After transfection with DNA from the tumor cells, the fibroblasts were grown in vitro, and the cultures were screened for the formation of clumps (foci) that contained cells that had been transformed by the added DNA. Of the 15 cell lines tested, five of them yielded DNA that could transform the recipient NIH3T3 cells. DNA from normal cells lacked this capability. These results demonstrated that carcinogenic chemicals produced alterations in the nucleotide sequences of genes that gave the altered genes the ability to transform other cells. Thus cellular genes could be converted into oncogenes in two different ways: as the result of becoming incorporated into the genome of a virus or by becoming altered by carcinogenic chemicals. Up to this point, virtually all of the studies on cancer-causing genes had been conducted in mice, chickens, or other organisms whose cells were highly susceptible to transformation. In 1981, attention turned to human cancer when it was shown that DNA isolated from human tumor cells can transform mouse NIH3T3 cells following transfection.13 Of 26 different human tumors that were tested in this study, two provided DNA that was capable of transforming mouse fibroblasts. In both cases, the DNA had been extracted from cell lines taken from a bladder carcinoma (identified as EJ and J82). Extensive efforts were undertaken to determine if the genes had been derived from a tumor virus, but no evidence of viral DNA was detected in these cells. These results provided the first evidence that some human cancer cells contain an activated oncogene that can be transmitted to other cells, causing their transformation. 
The finding that cancer can be transmitted from one cell to another by DNA fragments provided a basis for determining which genes in a cell, when activated by mutation or some other mechanism, are responsible for causing the cell to become malignant. To make this determination, it was necessary to isolate the DNA that was taken up by cells, causing their transformation. Once the foreign DNA responsible for transformation had been isolated, it could be analyzed for the presence of cancer-causing alleles. Within two months of one another in 1982, three different laboratories reported the isolation and cloning of an unidentified gene from human bladder carcinoma cells that can transform mouse NIH3T3 fibroblasts.14–16 Once the transforming gene from human bladder cancer cells had been isolated and cloned, the next step was to determine if this gene bears any relationship to the oncogenes carried by RNA tumor viruses. Once again, within two months of one another, three papers appeared from different laboratories reporting similar results.17–19 All three showed that the oncogene from human bladder carcinomas that transforms NIH3T3 cells is the same oncogene (named ras) that is carried by the Harvey sarcoma virus, which is a rat RNA tumor virus. Preliminary comparisons of the two versions of ras (the viral version and its cellular homologue) failed to show any differences, indicating that the two genes are either very similar or identical. These findings suggested that cancers that develop spontaneously in the human population are caused by a genetic alteration that is similar to the changes in cells that have been virally transformed in the laboratory. It is important to note that the types of cancers induced by the Harvey sarcoma virus (sarcomas and erythroleukemias) are quite different from the bladder tumors, which have an epithelial origin. This was the first indication that alterations in the same human gene, RAS, can cause a wide range of different tumors.

By the end of 1982, three more papers from different laboratories reported on the precise changes in the human RAS gene that lead to its activation as an oncogene.20–22 Once the section of the large DNA fragment that is responsible for causing transformation was pinned down, nucleotide sequence analysis indicated that the DNA from the malignant bladder cells is activated as a result of a single base substitution within the coding region of the gene. Remarkably, cells of both human bladder carcinomas studied (identified as EJ and T24) contain DNA with precisely the same alteration: a guanine-containing nucleotide at a specific site in the DNA of the proto-oncogene had been converted to a thymine-containing nucleotide in the activated oncogene. This base substitution results in the substitution of a valine for a glycine as the twelfth amino acid residue of the polypeptide. Determination of the nucleotide sequence of the v-ras gene carried by the Harvey sarcoma virus revealed an alteration in base sequence that affected precisely the same codon found to be modified in the DNA of the human bladder carcinomas. The change in the viral gene substitutes an arginine for the normal glycine. It was apparent that this particular glycine residue plays a critical role in the structure and function of this protein. It is interesting to note that the human RAS gene is a proto-oncogene that, like SRC, can be activated by linkage to a viral promoter. Thus RAS can be activated to induce transformation by two totally different pathways: either by increasing its expression or by altering the amino acid sequence of its encoded polypeptide.

The research described in this Experimental Pathway provided a great leap forward in our understanding of the genetic basis of malignant transformation. Much of the initial research on RNA tumor viruses stemmed from the belief that these agents may be important causal agents in the development of human cancer. The search for viruses as a cause of cancer led to the discovery of the oncogene, which led to the realization that the oncogene is a cellular sequence that is acquired by the virus, which ultimately led to the discovery that an oncogene can cause cancer without the involvement of a viral genome. Thus tumor viruses, which are not themselves directly involved in most human cancers, have provided the necessary window through which we can view our own genetic inheritance for the presence of information that can lead to our own undoing.

References
1. ROUS, P. 1911. Transmission of a malignant new growth by means of a cell-free filtrate. J. Am. Med. Assoc. 56:198.
2. BALTIMORE, D. 1970. RNA-dependent DNA polymerase in virions of RNA tumour viruses. Nature 226:1209–1211.
3. TEMIN, H. & MIZUTANI, S. 1970. RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature 226:1211–1213.
4. MARTIN, G. S. 1970. Rous sarcoma virus: A function required for the maintenance of the transformed state. Nature 227:1021–1023.
5. STEHELIN, D. ET AL. 1976. Purification of DNA complementary to nucleotide sequences required for neoplastic transformation of fibroblasts by avian sarcoma viruses. J. Mol. Biol. 101:349–365.
6. STEHELIN, D. ET AL. 1976. DNA related to the transforming gene(s) of avian sarcoma viruses is present in normal avian DNA. Nature 260:170–173.
7. SPECTOR, D. H., VARMUS, H. E., & BISHOP, J. M. 1978. Nucleotide sequences related to the transforming gene of avian sarcoma virus are present in DNA of uninfected vertebrates. Proc. Nat’l. Acad. Sci. U.S.A. 75:4102–4106.
8. PURCHIO, A. F. ET AL. 1978. Identification of a polypeptide encoded by the avian sarcoma virus src gene. Proc. Nat’l. Acad. Sci. U.S.A. 75:1567–1571.
9. COLLETT, M. S. & ERIKSON, R. L. 1978. Protein kinase activity associated with the avian sarcoma virus src gene product. Proc. Nat’l. Acad. Sci. U.S.A. 75:2021–2024.
10. WILLINGHAM, M. C., JAY, G., & PASTAN, I. 1979. Localization of the ASV src gene product to the plasma membrane of transformed cells by electron microscopic immunocytochemistry. Cell 18:125–134.
11. HUNTER, T. & SEFTON, B. M. 1980. Transforming gene product of Rous sarcoma virus phosphorylates tyrosine. Proc. Nat’l. Acad. Sci. U.S.A. 77:1311–1315.
12. SHIH, C. ET AL. 1979. Passage of phenotypes of chemically transformed cells via transfection of DNA and chromatin. Proc. Nat’l. Acad. Sci. U.S.A. 76:5714–5718.
13. KRONTIRIS, T. G. & COOPER, G. M. 1981. Transforming activity of human tumor DNAs. Proc. Nat’l. Acad. Sci. U.S.A. 78:1181–1184.
14. GOLDFARB, M. ET AL. 1982. Isolation and preliminary characterization of a human transforming gene from T24 bladder carcinoma cells. Nature 296:404–409.
15. SHIH, C. & WEINBERG, R. A. 1982. Isolation of a transforming sequence from a human bladder carcinoma cell line. Cell 29:161–169.
16. PULCIANI, S. ET AL. 1982. Oncogenes in human tumor cell lines: Molecular cloning of a transforming gene from human bladder carcinoma cells. Proc. Nat’l. Acad. Sci. U.S.A. 79:2845–2849.
17. PARADA, L. F. ET AL. 1982. Human EJ bladder carcinoma oncogene is a homologue of Harvey sarcoma virus ras gene. Nature 297:474–478.
18. DER, C. J. ET AL. 1982. Transforming genes of human bladder and lung carcinoma cell lines are homologous to the ras genes of Harvey and Kirsten sarcoma viruses. Proc. Nat’l. Acad. Sci. U.S.A. 79:3637–3640.
19. SANTOS, E. ET AL. 1982. T24 human bladder carcinoma oncogene is an activated form of the normal human homologue of BALB- and Harvey-MSV transforming genes. Nature 298:343–347.
20. TABIN, C. J. ET AL. 1982. Mechanism of activation of a human oncogene. Nature 300:143–149.
21. REDDY, E. P. ET AL. 1982. A point mutation is responsible for the acquisition of transforming properties by the T24 human bladder carcinoma oncogene. Nature 300:149–152.
22. TAPAROWSKY, E. ET AL. 1982. Activation of the T24 bladder carcinoma transforming gene is linked to a single amino acid change. Nature 300:762–765.


Chapter 16 Cancer

Synopsis

Cancer is a disease involving heritable defects in cellular control mechanisms that result in the formation of invasive tumors capable of releasing cells that can spread the disease to distant sites in the body. Many of the characteristics of tumor cells can be observed in culture. Whereas normal cells proliferate until they form a single layer (monolayer) over the bottom of the dish, cancer cells continue to grow in culture, piling on top of one another to form clumps. Other characteristics often revealed by cancer cells include an abnormal number of chromosomes, an ability to continue to divide indefinitely, a dependence on glycolysis, and a lack of responsiveness to neighboring cells. (p. 665)

Normal cells can be converted to cancer cells by treatment with a wide variety of chemicals, ionizing radiation, and a variety of DNA- and RNA-containing viruses; all of these agents act by causing changes in the genome of the transformed cell. Analysis of the cells of a cancerous tumor almost always shows that the cells have arisen by growth of a single cell (the tumor is said to be monoclonal). The development of a malignant tumor is a multistep process characterized by a progression of genetic alterations that make the cells increasingly less responsive to the body’s normal regulatory machinery and better able to invade normal tissues. The genes involved in carcinogenesis constitute a specific subset of the genome whose products are involved in such activities as control of the cell cycle, intercellular adhesion, and DNA repair. The level of expression of particular genes in different types of cancers can be determined using DNA microarrays. In addition to genetic alterations, growth of tumor cells is also affected by nongenetic and epigenetic influences that allow the cell to express its malignant phenotype. (p. 668)

Genes that have been implicated in carcinogenesis are divided into two broad categories: tumor-suppressor genes and oncogenes. Tumor-suppressor genes encode proteins that restrain cell growth and prevent cells from becoming malignant. Tumor-suppressor genes act recessively since both copies must be deleted or mutated before their protective function is lost. Oncogenes, in contrast, encode proteins that promote the loss of growth control and malignancy. Oncogenes arise from proto-oncogenes, which are genes that encode proteins having a role in a cell’s normal activities. Mutations that alter either the protein or its expression cause the proto-oncogene to act abnormally and promote the formation of a tumor. Oncogenes act dominantly; that is, a single copy will cause the cell to express the altered phenotype. Most tumors contain alterations in both tumor-suppressor genes and oncogenes. As long as a cell retains at least one copy of all of its tumor-suppressor genes, it should be protected against the consequences of oncogene formation. Conversely, the loss of a tumor-suppressor function should not be sufficient, by itself, to cause the cell to become malignant. (p. 671)

The first tumor-suppressor gene to be identified was RB, which is responsible for a rare retinal tumor called retinoblastoma, which occurs with high frequency in certain families but can also occur sporadically. Children with the familial form of the disease inherit one mutated copy of the gene. These individuals develop the cancer only after sporadic damage to the second allele in one of the retinal cells. RB encodes a protein called pRB, which is involved in regulating passage of a cell from G1 to S in the cell cycle. The unphosphorylated form of pRB interacts with certain transcription factors, preventing them from activating the genes required for certain S phase activities. Once pRB has been phosphorylated, the protein releases its bound transcription factor, which can then activate gene expression, leading to the initiation of S phase. (p. 673)

The tumor-suppressor gene most often implicated in human cancer is TP53, whose product (p53) may be able to suppress cancer formation by several different mechanisms. In one of its actions, p53 acts as a transcription factor that activates the expression of a protein (p21) that inhibits the cyclin-dependent kinase that moves a cell through the cell cycle. Damage to DNA triggers the phosphorylation and stabilization of p53, leading to the arrest of the cell cycle until the damage can be repaired. p53 can also redirect cells that are on the path toward malignancy onto an alternate path leading to either apoptosis or senescence. TP53 knockout mice begin to develop tumors several weeks after birth. Other tumor-suppressor genes include APC that, when mutated, predisposes an individual to developing colon cancer, and BRCA1 and BRCA2 that, when mutated, predispose an individual to developing breast cancer. (p. 675)

Most of the known oncogenes are derived from proto-oncogenes that play a role in the pathways that transmit growth signals from the extracellular environment to the cell interior, particularly the cell nucleus. A number of oncogenes have been identified that encode growth factor receptors, including platelet-derived growth factor (PDGF) and epidermal growth factor (EGF) receptors. Malignant cells may contain a much larger number of one of these growth factor receptors in their plasma membranes when compared to normal cells. The excess receptors make the cells sensitive to lower concentrations of the growth factor, so that the cells are stimulated to divide under conditions that would not affect normal cells. A number of cytoplasmic protein kinases, including both serine/threonine and tyrosine kinases, are included on the list of oncogenes. These include RAF, which encodes a protein kinase in the MAP kinase cascade. Mutated forms of RAS are among the most common oncogenes found in human cancers. As discussed in Chapter 15, Ras activates the protein kinase activity of Raf. If Raf remains in the activated state, it sends continual signals along the MAP kinase pathway, leading to the continual stimulation of cell proliferation. A number of oncogenes, such as MYC, encode proteins that act as transcription factors. Myc is normally one of the first proteins to appear when a cell is stimulated to reenter the cell cycle from the quiescent G0 phase. Another group of oncogenes, such as BCL-2, encode proteins involved in apoptosis. Overexpression of the BCL-2 gene leads to the suppression of apoptosis in lymphoid tissues, allowing abnormal cells to proliferate to form lymphoid tumors. Proteins that modify chromatin structure and certain metabolic enzymes (e.g., IDH) are also capable of serving as oncoproteins. (p. 679)

Cancer is currently treated by surgery, chemotherapy, and radiation. Several other strategies are being tested; these include immunotherapy, inhibition of proteins encoded by oncogenes, and inhibition of angiogenesis. To date, the greatest success has come with the development of an inhibitor of ABL kinase in patients with chronic myelogenous leukemia. A second success story has been the development of humanized antibodies that bind to a protein on the surface of malignant B cells in cases of non-Hodgkin’s lymphoma. Antiangiogenic strategies attempt to prevent a solid tumor from inducing the formation of new blood vessels that are required to supply the tumor cells with nutrients and other materials. (p. 687)


17 The Immune Response
17.1 An Overview of the Immune Response
17.2 The Clonal Selection Theory as It Applies to B Cells
17.3 T Lymphocytes: Activation and Mechanism of Action
17.4 Selected Topics on the Cellular and Molecular Basis of Immunity
THE HUMAN PERSPECTIVE: Autoimmune Diseases
EXPERIMENTAL PATHWAYS: The Role of the Major Histocompatibility Complex in Antigen Presentation

Living organisms provide ideal habitats in which other organisms can grow. It is not surprising, therefore, that animals are subject to infection by viruses, bacteria, protists, fungi, and animal parasites. Vertebrates have evolved several mechanisms that allow them to recognize and destroy these infectious agents. As a result, vertebrates are able to develop an immunity against invading pathogens. Immunity results from the combined activities of many different cells, some of which patrol the body, whereas others are concentrated in lymphoid organs, such as the bone marrow, thymus, spleen, and lymph nodes (Figure 17.1). Together, these dispersed cells and discrete organs form the body’s immune system. The cells of the immune system engage in a type of molecular screening by which they recognize “foreign” macromolecules, that is, ones whose structure is different from those of the body’s normal macromolecules. If foreign material is encountered, the immune system mounts a specific and concerted attack against it. The weapons of the immune system include (1) cells that kill or ingest infected or altered cells and (2) soluble proteins that can

T lymphocytes are cells of the immune system that become activated when they contact cells (called APCs) displaying foreign peptides on their surface. This image shows a T cell that has undergone its first division after its association with an APC. The dividing cell has been stained for two different proteins, one appearing red and the other green. It is evident that the two proteins are being distributed into different daughter cells. The evidence from this study suggests that these two cells go on to have different fates, one becoming a short-lived effector cell that carries out a specific T-cell response and the other a memory cell that will remain dormant until some later date when it is reactivated by subsequent contact with antigen. On a more general note, this image demonstrates that not all cell divisions produce identical daughter cells and that the process of cell division provides a means by which cells with different fates can be generated. (FROM JOHN T. CHANG ET AL., COURTESY OF STEVEN L. REINER, SCIENCE 315:1690, 2007, WITH PERMISSION FROM AAAS.)

[Figure 17.1 labels. Primary lymphoid organs: thymus (site of T-cell maturation); bone marrow (origin of B cells and T cells and site of B-cell maturation). Secondary lymphoid tissues: tonsil, lymph nodes, spleen, Peyer’s patches of intestine, appendix. Also labeled: lymph vessels.]

17.1 | An Overview of the Immune Response

The outer surface of the body and the linings of its internal tracts provide an excellent barrier to prevent penetration by viruses, bacteria, and parasites. If these surface barriers are breached, a series of immune responses are initiated that attempt to contain and then kill the invaders. Immune responses can be divided into two general categories: innate responses and adaptive (or acquired) responses. Both types of responses depend on the ability of the body to distinguish between materials that are supposed to be there (i.e., “self”) and those that are not (i.e., foreign, or “nonself”). We can also distinguish two categories of pathogens: those that occur primarily inside a host cell (all viruses, some bacteria, and certain protozoan parasites) and those that occur primarily in the extracellular compartments of the host (most bacteria and other cellular pathogens). Different types of immune mechanisms have evolved to combat these two types of infections. An overview of some of these mechanisms is shown in Figure 17.2.

Innate Immune Responses

Figure 17.1 The human immune system includes various lymphoid organs, such as the thymus, bone marrow, spleen, lymph nodes, and scattered cells located as patches within the small intestine, appendix, and tonsils. The thymus and bone marrow are often described as the central immune system because of their key roles in lymphocyte differentiation.

Chapter 17 The Immune Response

FROM WESSNER, MICROBIOLOGY, 1E, FIGURE 20.4. JOHN WILEY & SONS PUBLISHERS. REPRINTED BY PERMISSION OF JOHN WILEY & SONS, INC.

neutralize, immobilize, agglutinate, or kill pathogens. Pathogens, in turn, are continually evolving countermechanisms to avoid immune destruction. The fact that humans suffer from a number of chronic infective diseases, such as AIDS (caused by a virus), tuberculosis (caused by a bacterium), and malaria (caused by a protozoan) illustrates how our immune systems are not always successful in combating these microscopic pathogens. The immune system is also implicated in the body’s fight against cancer, but the degree to which the system can recognize and kill cancer cells remains controversial. In some cases, the immune system may mount an inappropriate response that attacks the body’s own tissues. As discussed in the Human Perspective on page 724, these incidents can lead to serious disease. It is impossible to cover the entire subject of immunity in a single chapter. Instead, we will focus on a number of selected aspects that illustrate principles of cell and molecular biology that were discussed in previous chapters. First, however, it is necessary to examine the basic events in the body’s response to the presence of an intruding microbe.

Innate immune responses are those the body mounts immediately without requiring previous contact with the microbe. Thus, they provide the body with a first line of defense. An invading pathogen typically makes its initial contact with the innate immune system when it is greeted by a phagocytic cell, such as a macrophage or a dendritic cell (see Figure 17.10), whose function is to recognize foreign objects and sound an appropriate alarm. In 1989, Charles Janeway, Jr. of Yale University published a far-sighted proposal to address the role of the innate immune system both in providing the body with immediate protection from pathogens and in stimulating the cells of the adaptive immune system to become subsequently involved in overcoming the threat. We will restrict the present discussion to the innate immune system. Janeway proposed that cells of the innate system possess a variety of microbial sensors that directly recognize certain highly conserved macromolecules that play essential roles in the propagation of viruses or bacteria but are not produced by cells of the body. Janeway called such sensors pattern recognition receptors (or PRRs). Over the past two decades, a number of families of PRRs have been identified, the most important of which are the Toll-like receptors (TLRs), which is the only group that will be discussed in this chapter. The discovery of TLRs came about by way of an interesting and unexpected series of events. The fruit fly Drosophila melanogaster is well known for its iconic status in the field of genetics, and also for key contributions to the study of development and neurobiology. But, as an invertebrate, Drosophila would not have been considered a likely organism for a major discovery concerning the workings of the human immune system. However, in 1996, Jules Hoffmann and colleagues in France identified a mutant fruit fly that was highly susceptible to fungal infections, as is evident from the photograph in Figure 17.3. 
The fly pictured in Figure 17.3 is lacking a protein called Toll, which had been previously

[Figure 17.2 labels. Panel titles: INNATE IMMUNITY (NONSPECIFIC) and ADAPTIVE IMMUNITY (SPECIFIC). Panels a through g. Labeled components: bacterium, phagocyte, complement, NK cell, virus, infected cell, infected apoptotic cell, IFN-α, cell resistant to viral infection, dendritic cell or macrophage, B cell, antibody, bacterial toxin, bacterium coated with antibody molecules, antibody layer, T cell.]

Figure 17.2 An overview of some of the mechanisms by which the immune system rids the body of invading pathogens. The left panel depicts several types of innate immunity: (a), phagocytosis of a bacterial cell; (b), bacterial cell killing by complement; (c), apoptosis induced in an infected cell by a natural killer (NK) cell; and (d), induction of viral resistance by interferon α (IFN-α). The right panel depicts several types of adaptive immunity: (e), B cell producing antibodies that neutralize a bacterial toxin; (f), bacterial cell coated with antibody (opsonized), which makes it susceptible to phagocytosis or complement-induced death; and (g), apoptosis induced in an infected cell by an activated T lymphocyte (T cell). Innate and adaptive immune responses are linked to one another (horizontal green arrow) because cells, such as dendritic cells and macrophages, that phagocytize pathogens, use the foreign proteins to stimulate the production of specific antibodies and T cells directed against the pathogen. NK cells also produce substances (e.g., IFN-γ) that influence a T-cell response.

Figure 17.3 Scanning electron micrograph of a mutant fruit fly that had died from a fungal infection. The body is covered with germinating fungal hyphae. This individual was susceptible to infection because it lacked a functional Toll gene. (FROM BRUNO LEMAITRE ET AL., CELL 86:978, 1996, WITH PERMISSION FROM ELSEVIER. COURTESY OF JULES A. HOFFMANN.)

identified as a protein required for the normal development of dorsoventral (“top-bottom”) polarity of the fly embryo. In fact, “Toll” is the German word for “weird,” which described the jumbled appearance of fly embryos lacking a functional Toll gene. In addition, they found that Toll acted in flies through a pathway involving the transcription factor NF-κB, which was known to be a key player in the activation of an immune response in vertebrates. It appeared that Toll serves a dual function in flies, as both a director of embryonic polarity and a factor that promotes innate immunity to infection. The discovery of the Toll gene as an important contributor to the fruit fly’s defense system led Janeway and colleague Ruslan Medzhitov to clone and characterize a human homologue of the Toll protein. In addition, they found that the human Toll homologue also acted by way of the NF-κB pathway, inducing the expression of a number of immunological effector proteins (called cytokines). It was not known, however, what specific role this “human Toll” played in the immune response. Meanwhile, Bruce Beutler and his colleagues at the University of Texas Southwestern Medical Center were searching for mammalian genes that were involved in recognizing bacterial components, particularly a component of the outer membrane of gram-negative bacteria, called lipopolysaccharide (LPS). They found that a mouse strain that was unable to respond to LPS was lacking a particular gene that encoded an LPS receptor. Strikingly, this LPS receptor was the same Toll homologue (TLR4) that had been characterized by Janeway and Medzhitov, thus defining its role as a specific bacterial sensor. As it happens, humans express at least ten functional TLRs, all of which are transmembrane proteins


that are found in a variety of different types of cells. TLRs may be present either on the cell surface (where they interact with extracellular microbes) or within endosomal/lysosomal membranes (where they interact with endocytosed microbes). Within the human TLR family are receptors that recognize the lipopolysaccharide or peptidoglycan components of the bacterial cell wall, the protein flagellin found in bacterial flagella, double-stranded RNA characteristic of replicating viruses, and unmethylated CpG dinucleotides (which are characteristic of bacterial DNA). A model of a TLR bound to its double-stranded RNA ligand is shown in Figure 17.4. Activation of a TLR by one of these pathogen-derived molecules initiates a signal cascade within the cell that can lead to a variety of protective immune responses including the activation of cells of the adaptive immune system (represented by the horizontal green arrow in Figure 17.2). For this reason, a number of pharmaceutical companies are working on drugs that stimulate TLRs with the aim of enhancing the body’s response against stubborn infections, such as that caused by the hepatitis C virus. Aldara, which was approved in 1997 and is prescribed for a number of skin conditions including genital warts, was later discovered to act by stimulating a TLR.

Innate responses to invading pathogens are typically accompanied by a process of inflammation at the site of infection where certain cells and plasma proteins leave the blood vessels and enter the affected tissues (page 255). These events are accompanied by local redness, swelling, and fever. Inflammation provides a means for concentrating the body’s defensive agents at the site where they are needed. During inflammation, phagocytic cells migrate toward a site of infection in response to chemicals (chemoattractants) released at the site (page 377). Once there, these cells recognize, engulf, and destroy the pathogen (Figure 17.2a). Inflammation is a double-edged sword. Although it protects the body against an invading pathogen, if inflammation is not terminated in a timely manner it can lead to damage to the body’s normal tissues and chronic disease. Regulation of inflammation is a complex and poorly understood process involving a balance between pro- and anti-inflammatory activities.

A number of other mechanisms are also geared to attacking extracellular pathogens. Both epithelial cells and lymphocytes secrete a variety of antimicrobial peptides, called defensins, which are able to bind to viruses, bacteria, or fungi and bring about their demise. Blood also contains a group of soluble proteins called complement that binds to pathogens, triggering their destruction. In one of the complement pathways, an activated assembly of these proteins perforates the plasma membrane of a bacterial cell, leading to cell lysis and death (Figure 17.2b).

Innate responses against intracellular pathogens, such as viruses, are targeted primarily against cells that are already infected. Cells infected with certain viruses are recognized by a type of nonspecific lymphocyte called a natural killer (NK) cell. As their name implies, NK cells act by causing the death of the infected cell (Figure 17.2c). The NK cell induces the infected cell to undergo apoptosis (page 656). NK cells can also kill certain types of cancer cells in vitro (Figure 17.5) and may provide a mechanism for destroying such cells in vivo before they develop into a tumor. Normal (i.e., noninfected and nonmalignant) cells possess surface molecules that protect them from attack by NK cells.

Another type of innate antiviral response is initiated within the infected cell itself. Virus-infected cells produce proteins called type 1 interferons (IFN-α and IFN-β) that are secreted into the extracellular space, where they bind to the surface of noninfected cells rendering them resistant to subsequent infection (Figure 17.2d). Interferons accomplish this by several means, including activation of a signal transduction pathway that results in phosphorylation and consequent inactivation of the translation factor eIF2 (page 469). Cells that have undergone this response cannot synthesize viral proteins that are required for virus replication. IFN-β may also induce the synthesis of cellular microRNAs that target viral RNA genomes.

The innate and adaptive immune systems do not function independently but work closely together to destroy a foreign invader. The dependence of adaptive immune responses on prior events orchestrated by cells of the innate system was an important tenet of Janeway’s original hypothesis (page 700). Most importantly, the same phagocytic cells and NK cells that carry out an immediate innate response are also responsible for initiating the much slower, more specific adaptive immune response. As we will see below, it is the nonspecific cells of the innate system that are responsible for activating only those specific cells of the adaptive system that are capable of dealing with the particular threat that is at hand.

Figure 17.4 Model of a Toll-like receptor (TLR3) bound to a double-stranded RNA (dsRNA) molecule. The extracellular ligand-binding domains of TLRs contain a large curved surface composed largely of leucine-rich repeats that provide flexibility in ligand binding. The dsRNA ligand (blue and orange double helix) binds to a TLR dimer whose transmembrane and cytoplasmic domains are brought into close contact as shown in this model. The complex is “ready” to send an intracellular signal alerting the cell of the presence of the foreign nucleic acid molecule. (FROM LIN LIU ET AL., COURTESY OF DAVID R. DAVIES, SCIENCE 320:381, 2008; © 2008. REPRINTED WITH PERMISSION FROM AAAS.)

Figure 17.5 Innate immunity. Scanning electron micrograph of a natural killer cell bound to a target cell, in this case, a malignant erythroleukemia cell. NK cells kill their targets by a similar mechanism as described for CTLs on page 709. Labels: NK cell; cancer cell. (FROM BLOOD CELLS 17:165, 1991, WITH PERMISSION FROM ELSEVIER. COURTESY OF GIUSEPPE ARANCIA, DEPT. OF ULTRASTRUCTURES, INSTITUTO SUPERIORE DI SANITÁ, ROME.)

Adaptive Immune Responses

Unlike innate responses, adaptive immune responses require a lag period during which the immune system gears up for an attack against a foreign agent. Unlike innate responses, adaptive immune responses are highly specific and can discriminate between two very similar molecules. For example, the blood of a person who has just recovered from measles contains antibodies that react with the virus that causes measles, but not with a related virus, such as that which causes mumps. Unlike the innate system, the adaptive system also has a “memory,” which usually means that the person will not suffer again from the same pathogen later in life. Whereas all animals possess some type of innate immunity against microbes and parasites, only vertebrates are known to mount an adaptive response. There are two broad categories of adaptive immunity:

■ Humoral immunity, which is carried out by antibodies (Figure 17.2e,f). Antibodies are globular, blood-borne proteins of the immunoglobulin superfamily (IgSF).

■ Cell-mediated immunity, which is carried out by cells (Figure 17.2g).

Both types of adaptive immunity are mediated by lymphocytes, which are nucleated leukocytes (white blood cells) that circulate between the blood and lymphoid organs. Humoral immunity is mediated by B lymphocytes (or B cells) that, when activated, differentiate into cells that secrete antibodies. Antibodies are directed primarily against foreign materials that are situated outside the body’s cells. Such materials include the protein and polysaccharide components of bacterial cell walls, bacterial toxins, and viral coat proteins. In some cases, antibodies can bind to a bacterial toxin or virus particle and directly prevent the agent from entering a host cell (Figure 17.2e). In other cases, antibodies function as “molecular tags” that bind to an invading pathogen and mark it for destruction. Bacterial cells coated with antibody molecules (Figure 17.2f) are rapidly ingested by wandering phagocytes or destroyed by complement molecules carried in the blood. Antibodies are not effective against pathogens that are present inside cells, hence the need for a second type of weapon system. Cell-mediated immunity is carried out by T lymphocytes (or T cells) that, when activated, can specifically recognize and kill an infected (or foreign) cell (Figure 17.2g).

B and T cells arise from the same type of precursor cell (a hematopoietic stem cell) in the bone marrow, but they differentiate along different pathways in different lymphoid organs. A summary of the various pathways of differentiation of the hematopoietic stem cell is shown in Figure 17.6. B lymphocytes differentiate in the fetal liver or adult bone marrow, whereas T lymphocytes differentiate in the thymus gland, an organ located in the chest that reaches its peak size during childhood. Because of these differences, cell-mediated and humoral immunity can be dissociated to a large extent. For instance, humans may suffer from a rare disease called congenital agammaglobulinemia in which humoral antibody is deficient and cell-mediated immunity is normal.

[Figure 17.6 labels: bone marrow, hematopoietic stem cell, myeloid progenitor cell, lymphoid progenitor cell, eosinophil, basophil, mast cell, neutrophil, dendritic cell, erythrocytes, monocyte, platelets, macrophage, T cell, B cell, plasma cell, NK cell.]

Figure 17.6 Pathways of differentiation of a hematopoietic stem cell of the bone marrow. A hematopoietic stem cell can give rise to two different progenitor cells: a myeloid progenitor cell that can differentiate into most of the various blood cells (e.g., erythrocytes, basophils, and neutrophils), macrophages, or dendritic cells; or a lymphoid progenitor cell that can differentiate into any of the various types of lymphocytes (NK cells, T cells, or B cells). T-cell precursors migrate to the thymus where they differentiate into T cells. In contrast, B cells undergo differentiation in the bone marrow. Cells in the various stages of B- and T-cell differentiation can be distinguished by the species of proteins at their cell surface and/or the transcription factors that determine the genes being expressed.

REVIEW
1. Contrast the general properties of innate and adaptive immune responses.
2. List four types of innate immune responses. Which would be most effective against a pathogen inside an infected cell?
3. What is meant by the terms “humoral” and “cell-mediated” immunity?

17.2 | The Clonal Selection Theory as It Applies to B Cells

If a person is infected with a virus or exposed to foreign material, his or her blood soon contains a high concentration of antibodies capable of reacting with the foreign substance, which is known as an antigen. Most antigens consist of proteins or polysaccharides, but lipids and nucleic acids can also serve in this capacity. How can the body produce antibodies that react specifically with an antigen to which the body is exposed? In other words, how does an antigen induce an adaptive immune response? Initially, it was thought that antigens somehow instructed lymphocytes to produce complementary antibodies. It was suggested that an antigen wraps itself around an antibody molecule, molding the antibody into a shape capable of combining with that particular antigen. In this “instructive” model, the lymphocyte only gains its ability to produce a specific antibody after its initial contact with antigen. In 1955, Niels Jerne, a Danish immunologist, proposed a radically different mechanism. Jerne suggested that the body produces small amounts of randomly structured antibodies in the absence of any antigen. As a group, these antibodies would be able to combine with any type of antigen to which a person might someday be exposed. According to Jerne’s model, when a person is exposed to an antigen, the antigen combines with a specific antibody, which somehow leads to the subsequent production of that particular antibody molecule. Thus in Jerne’s model, the antigen selects those preexisting antibodies capable of binding to it. In 1957, the concept of antigen selection of antibodies was expanded into a comprehensive model of antibody formation by the Australian immunologist F. MacFarlane Burnet. Burnet’s clonal selection theory quickly gained widespread acceptance. An overview of the steps that occur during the clonal selection of B cells is shown in Figure 17.7. More detailed discussion of these events is provided later in the chapter. The clonal selection of T cells is described in the following section. The main features of B-cell clonal selection are the following:

1. Each B cell becomes committed to produce one species of antibody. B cells arise from a population of undifferentiated and indistinguishable progenitor cells. As it differentiates, a B cell becomes committed as the result of DNA rearrangements (see Figure 17.18) to producing only one species of antibody molecule (Figure 17.7, step 1). Thousands of different DNA rearrangements are possible, so that different B cells produce different antibody molecules. Thus, even though mature B cells appear identical under the microscope, they can be distinguished by the antibodies they produce.

2. B cells become committed to antibody formation in the absence of antigen. The basic repertoire of antibody-producing cells that a person will possess throughout his or her lifetime is already present within the lymphoid tissues prior to stimulation by an antigen and is independent of the presence of foreign materials. Each B cell displays its particular antibody on its surface with the antigen-reactive portion facing outward. As a result, the cell is coated with antigen receptors that can bind specifically with antigens having a complementary structure. Although most lymphoid cells are never required during a person’s lifetime, the immune system is primed to respond immediately to any antigen that a person may be exposed to. The presence of cells with different membrane-bound antibodies can be demonstrated experimentally, as shown in Figure 17.8.

3. Antibody production follows selection of B cells by antigen. In most cases, the activation of a B cell by antigen requires the involvement of T cells (discussed on pages 709 and 723). A few antigens, however, such as the polysaccharides present in bacterial cell walls, activate B cells by themselves; antigens of this type are described as thymus-independent antigens. For simplicity, we will restrict the discussion at this point in the chapter to a thymus-independent antigen. Suppose a person were to be exposed to Haemophilus influenzae type B, an encapsulated bacterium that can cause fatal meningitis. The capsule of these bacteria contains a polysaccharide that can bind to a tiny fraction of the body’s B cells (Figure 17.7, step 2). The B cells that bind the polysaccharide contain membrane-bound antibodies whose combining site allows them to interact specifically with that antigen. In this way, an antigen selects those lymphocytes that produce antibodies capable of interacting with that antigen. Antigen binding activates the B cell, causing it to proliferate (Figure 17.7, step 3) and form a population (or clone) of lymphocytes, all of which make the same antibody. Some of these activated cells differentiate into short-lived plasma cells that secrete large amounts of antibody molecules (Figure 17.7, step 4). Unlike their B-cell precursors (Figure 17.9a), plasma cells possess an extensive rough ER characteristic of cells that are specialized for protein synthesis and secretion (Figure 17.9b).

4. Immunologic memory provides long-term immunity. Not all B lymphocytes that are activated by antigen differentiate into antibody-secreting plasma cells. Some remain in lymphoid tissues as memory B cells (Figure 17.7, step 5) that can respond rapidly at a later date if the antigen reappears in the body. Although plasma cells die off following removal of the antigenic stimulus, memory B cells may persist for a person’s lifetime. It has been demonstrated, for example, that elderly individuals who were alive during the 1918 influenza pandemic still contain circulating B cells that are specific for the pathogen they were exposed to 90 years earlier. When stimulated by the same antigen, some of the memory B cells rapidly proliferate into plasma cells, generating a secondary immune response in a matter of hours rather than the days required for the original response (see Figure 17.13).

5. Immunologic tolerance prevents the production of antibodies against self. As discussed below, genes encoding antibodies are generated by a process in which DNA segments are randomly combined. As a result, genes are invariably formed that encode antibodies that can react with the body’s own tissues, which could produce widespread organ destruction and subsequent disease. It is obviously in the best interest of the body to prevent the production of such proteins, which are called autoantibodies. As they develop, many of the B cells capable of producing autoantibodies are either destroyed or rendered inactive. As a result, the body develops an immunologic tolerance toward itself. As discussed in the Human Perspective on page 724, a breakdown of the tolerant state can lead to the development of debilitating autoimmune diseases.

[Figure 17.7 labels (steps 1 to 5): stem cell; proliferation and commitment to formation of a specific antibody occurring in absence of antigen; committed B lymphocytes with "samples" of their antibodies embedded in their plasma membrane; exposure to bacterium; encapsulated bacterium; polysaccharide of bacterial capsule; proliferation of antigen-specific B cell following lymphocyte selection; plasma cells secreting antibodies capable of binding antigen; memory cell remains behind for future encounter with antigen (responsible for long-term immunity).]

Figure 17.7 The clonal selection of B cells by a thymus-independent antigen. The steps are described in the figure and also in the text.

[Figure 17.8 labels (steps 1 to 4b): spleen; prepare spleen cells; B-cell suspension; bead with covalently bound antigen A; spleen cell bound to bead; bound cell; spleen cell that had bound to bead with antigen A; spleen cells that did not bind to beads with antigen A; beads with bound antigen A; beads with bound antigen B.]

Figure 17.8 Experimental demonstration that different B cells contain a different membrane-bound antibody and that these antibodies are produced in the absence of antigen. In this experiment, B cells are prepared from a mouse spleen (step 1). In step 2, the spleen cells are passed through a column containing beads coated with an antigen (antigen A) to which the mouse had never been exposed. A tiny fraction of the spleen cells bind to the beads, while the vast majority of spleen cells pass directly through the column (shown in step 3). In step 4, spleen cells from the previous experiment are passed through one of two different columns: a column whose beads are coated with antigen A or a column whose beads are coated with an unrelated antigen (antigen B) to which the mouse had never been exposed. In step 4a, the spleen cells tested are those that had bound to the beads in the previous step. These cells are found to rebind to beads coated with antigen A, but do not bind to beads coated with antigen B. In step 4b, the spleen cells tested are those that did not bind to the beads in the previous step. None of these cells bind to beads coated with antigen A, but a tiny fraction binds to beads coated with antigen B.

Figure 17.9 Comparison of the structure of a B cell (a) and a plasma cell (b). The plasma cell has a much larger cytoplasmic compartment than the B cell, with more mitochondria and an extensively developed rough endoplasmic reticulum. These characteristics reflect the synthesis of large numbers of antibody molecules by the plasma cell. (A-B: STEVE GSCHMEISSNER/PHOTO RESEARCHERS, INC.)

Several principles of the clonal selection theory can be illustrated by briefly considering the subject of vaccination.

Vaccination

Edward Jenner practiced medicine in the English countryside at a time when smallpox was one of the most prevalent and dreaded diseases. Over the years, he noticed that the maids who tended the cows were typically spared the ravages of the disease. Jenner concluded that milkmaids were somehow “immune” to smallpox because they were infected at an early age with cowpox, a harmless disease they contracted from their cows. Cowpox produces blisters that resemble the pus-filled blisters of smallpox, but the cowpox blisters are localized and disappear, causing nothing more serious than a scar at the site of infection. In 1796, Jenner performed one of the most famous (and risky) medical experiments of all time. First, he infected an eight-year-old boy with cowpox and gave the boy time to recover. Six weeks later, Jenner intentionally infected the boy with smallpox by injecting pus from a smallpox lesion directly under the boy’s skin. The boy showed no signs of the deadly disease. Within a few years, thousands of people had become immune to smallpox by intentionally infecting themselves with cowpox. This procedure was termed vaccination, after vacca, the Latin word for cow. Jenner’s experiment was successful because the immune response generated against the virus that causes cowpox happens to be effective against the closely related virus that causes smallpox. Most modern vaccines contain attenuated pathogens, which are pathogens that are capable of stimulating immunity but have been genetically “crippled” so that they are unable to cause disease. Most vaccines currently in use are B-cell vaccines, such as that employed to fight tetanus. Tetanus results from infection by the anaerobic soil bacterium Clostridium tetani, which can enter the body through a puncture wound. As they grow, the bacteria produce a powerful neurotoxin that blocks transmission across inhibitory synapses on motor neurons, leading to sustained muscle contraction and asphyxiation. 
At two months of age, most infants are immunized against tetanus by inoculation with a modified and harmless version of the tetanus toxin (called a toxoid ). The tetanus toxoid binds to the surfaces of B cells whose membrane-bound antibody molecules have a complementary binding site. These B cells proliferate to form a clone of cells that produce antibodies capable of

707

binding to the actual tetanus toxin. This initial response soon wanes, but the person is left with memory cells that respond rapidly if the person should happen to develop a C. tetani infection at a later date. Unlike most immunizations, immunity to the tetanus toxin does not last a lifetime, which is the reason that people are given a booster shot every ten years or so. The booster shot contains the toxoid protein and stimulates the production of additional memory cells. What if a person receives a wound that has the potential to cause tetanus, and they cannot remember ever receiving a booster shot? In these cases, the person is likely to be given a passive immunization, consisting of antibodies that can bind the tetanus toxin. Passive immunization is effective for only a short period of time and does not protect the recipient against a subsequent infection.

REVIEW 1. Contrast instructive and selective mechanisms of antibody production. 2. What are the basic tenets of the clonal selection theory? 3. What does it mean for a B cell to become committed to antibody formation? How is this process influenced by the presence of antigen? What role does antigen play in antibody production? 4. What is meant by the terms immunologic memory and immunologic tolerance?

17.3 | T Lymphocytes: Activation and Mechanism of Action

Like B cells, T cells are also subject to a process of clonal selection. T cells possess a cell-surface protein, called a T-cell receptor, that allows them to interact specifically with a particular antigen. Like the antibody molecules that serve as B-cell receptors, the proteins that serve as T-cell receptors exist as a large population of molecules that have differently shaped combining sites. Just as each B cell produces only one species of antibody, each T cell has only a single species of T-cell receptor. It is estimated that adult humans possess approximately 10¹² T cells that, collectively, exhibit more than 10⁷ different antigen receptors.

T cells are activated by fragments of antigens that are displayed on the surfaces of other cells, called antigen-presenting cells (APCs). Consider what would happen if a liver or kidney cell were infected with a virus. The infected cell would display portions of the viral proteins on its surface (see Figure 17.24), enabling the infected cell to bind to a T cell with the appropriate (cognate) T-cell receptor. As a result of this presentation, the immune system becomes alerted to the entry of a specific pathogen. The process of antigen presentation is discussed at length later in the chapter (page 717) and is also the subject of the Experimental Pathways. Whereas any infected cell can serve as an APC in activating T cells, certain types of “professional” APCs are specialized for this function. Professional APCs include dendritic cells and macrophages (Figure 17.10). We will focus on dendritic cells (DCs), which were first discovered and characterized by Ralph Steinman of Rockefeller University in the early 1970s and are often described as the “sentinels” of the immune system. Dendritic cells have earned this title because they “stand guard” in the body’s peripheral tissues where pathogens are likely to enter (such as the skin and airways). DCs are equipped with a wide variety of receptors capable of recognizing virtually every type of pathogen. This property makes DCs a major component of the innate immune system (page 700). As discussed in the following paragraphs, DCs utilize the information about the pathogens they ingest to initiate a response from the appropriate cells of the adaptive immune system.

Figure 17.10 Professional antigen-presenting cells (APCs). (a) Colorized scanning electron micrograph of a Kupffer cell (a type of macrophage) as it penetrates openings in the endothelium that lines the sinusoidal vessels of the liver. These cells are capable of ingesting aged red blood cells and pathogens and accumulate at sites of infection or injury. (b) Colorized SEM of a dendritic cell. These irregular-shaped cells, which are characterized by long cytoplasmic (or dendritic) processes, are concentrated in the tissues that line the body’s surfaces with the external environment. (A: COURTESY OF THOMAS DEERINCK, NCMIR, UNIVERSITY OF CALIFORNIA, SAN DIEGO; B: DAVID SCHARF/PHOTO RESEARCHERS, INC.)

When present in the body’s peripheral tissues, immature DCs recognize and ingest microbes and other foreign materials by phagocytosis. Once a microbe is taken into a dendritic cell, it has to be processed before its components can be presented to another cell. Antigen processing requires that the ingested material be fragmented enzymatically in the cytoplasm and that the fragments be moved to the cell surface (see Figure 17.23). DCs that have processed antigen migrate to nearby lymph nodes where they differentiate into mature antigen-presenting cells. Once in a lymph node, DCs come into contact with a large pool of T cells, including a minute percentage whose T-cell receptors can bind specifically with the processed foreign antigen, which activates the T cell. This dynamic process of DC–T cell interactions has been visualized recently in living lymph-node tissues by microscopic imaging of fluorescently labeled cells (Figure 17.11). In the absence of antigen, a given DC may interact transiently with as many as 500 to 5000 different T cells, remaining in contact with each cell for only a few minutes (as in Figure 17.11). In contrast, when the DC displays an antigen that is specifically recognized by the TCR of the T cell, the interaction between the cells is seen to last a period of hours, leading to the activation of the T cell, as evidenced by a transient increase in the cytosolic Ca²⁺ concentration.

Once activated, a T cell proliferates to form a clone of T cells having the same T-cell receptor. It is estimated that a single activated T cell can divide three to four times per day for several days, generating a tremendous population of T cells capable of interacting with the foreign antigen. The massive proliferation of specific T lymphocytes in response to an infection is often reflected in the enlargement of local lymph nodes. Once the foreign antigen has been cleared, the vast majority of the expanded T-cell population dies by apoptosis, leaving behind a relatively small population of memory T cells capable of responding rapidly in the event of future contact with the same pathogen.

Unlike B cells, which secrete antibodies, T cells carry out their assigned function through direct interactions with other cells, including B cells, other T cells, or target cells located throughout the body. This cell–cell interaction may lead to the activation, inactivation, or death of the other cell. In addition to direct cell contact, many T-cell interactions are mediated by highly active chemical messengers, called cytokines, that work at very low concentrations. Cytokines are small secreted proteins produced by a wide variety of cells and include interferons (IFNs), interleukins (ILs), and tumor necrosis factors (TNFs). Cytokines bind to specific receptors on the surface of a responding cell, generating an internal signal that alters the activity of the cell. In responding to a cytokine, a cell may prepare to divide, undergo differentiation, or secrete its own cytokines. One family of small cytokines, called chemokines, acts primarily

Figure 17.11 Live cell imaging of DCs and T cells, and their interactions, within a lymph node. T cells travel from lymph node to lymph node within the body. When they enter a lymph node, they migrate within the tissue as their surface is scanned by individual DCs with which they come into contact. (a) Several fluorescently labeled T cells (stained green, 3 of which are indicated by labeled numbers) are seen moving around within a lymph node and coming in contact with an individual DC (stained red and indicated by the asterisk) over a period of 2.5 minutes. (b) The trajectory taken by each of the three numbered T cells and the asterisk-marked DC shown in part a. (c) Contacts within a lymph node between a single DC (green) and several T cells (orange) rendered in three dimensions. Contacts between the cells are dynamic, changing rapidly in size over a period of tens of seconds. (A–B: FROM PHILIPPE BOUSSO AND ELLEN ROBEY, NATURE IMMUNOL. 4:581, 2003. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.; C: FROM MARK J. MILLER ET AL., COURTESY OF MICHAEL D. CAHALAN, PROC. NAT’L. ACAD. SCI. U.S.A. 101:1002, 2004 © 2004, NATIONAL ACADEMY OF SCIENCES, U.S.A.)

Table 17.1 Selected Cytokines

Cytokine   Source                 Major functions
IL-1       Diverse                Induces inflammation, stimulates TH-cell proliferation
IL-2       TH cells               Stimulates T-cell and B-cell proliferation
IL-4       T cells                Induces IgM to IgG class switching in B cells, suppresses inflammatory cytokine action
IL-5       TH cells               Stimulates B-cell differentiation
IL-10      T cells, macrophages   Inhibits macrophage function, suppresses inflammatory cytokine action
IFN-γ      TH cells, CTLs         Induces MHC expression in APCs, activates NK cells
TNF-α      Diverse                Induces inflammation, activates NO production in macrophages
GM-CSF     TH cells, CTLs         Stimulates growth and proliferation of granulocytes and macrophages

[Figure 17.12 diagram labels: antigen; macrophage (or dendritic cell); antigen processed inside macrophage and portions displayed at cell surface; T-cell receptor (TCR); T-helper cell; activated T-helper cell; B cell; antigen receptor (membrane-bound antibody); cytokines, e.g., IL-4, IL-5, and IL-6; proliferation of activated B cell; plasma cell secreting antibody; steps 1 to 5.]

as chemoattractants that stimulate the migration of lymphocytes into inflamed tissue. Different types of lymphocytes and phagocytes possess receptors for different chemokines, so that their migration patterns can be separately controlled. A list of some of the best-studied cytokines is given in Table 17.1.

Three major subclasses of T cells can be distinguished by the proteins on their surfaces and their biological functions.

1. Cytotoxic T lymphocytes (CTLs) screen the cells of the body for abnormalities. Under normal circumstances, healthy cells are not harmed by CTLs, but aged or infected cells, and possibly some malignant cells, are attacked and killed. CTLs kill target cells by inducing them to undergo apoptosis. Two distinct cell-killing pathways have been described. In one pathway, the CTL releases perforins and granzymes into a tightly enclosed space between the cells. Perforins are proteins that assemble within the membrane of the target cell to form transmembrane channels. Granzymes are proteolytic enzymes that enter the perforin channels and activate caspases, which are the proteolytic enzymes that initiate the apoptotic response (page 656). In the alternate pathway, the CTL binds to a receptor on the target cell surface, activating a suicide pathway in the target cell similar to that of Figure 15.39. By killing infected cells, CTLs eliminate viruses, bacteria, yeast, protozoa, and parasites after they have found their way into host cells and are no longer accessible to circulating antibodies. CTLs possess a surface protein called CD8 (cluster designation 8) and are referred to as CD8⁺ cells.

2. Helper T lymphocytes (TH cells) activate other immune cells in an orchestrated attack against a specific pathogen. They are distinguished from CTLs by the presence of the protein CD4 at their surface rather than CD8.¹ TH cells are activated by professional antigen-presenting cells, such as dendritic cells and macrophages, as shown in Figure 17.12. This is one of the first and most important steps in initiating an adaptive immune response. Nearly all B cells require the help of TH cells before they can mature and differentiate into antibody-secreting plasma cells. B cells are activated by direct interaction with a TH cell that is specific for the same antigen, as shown in Figure 17.12 (and in more detail in Figure 17.26). Thus, the formation of antibodies requires the activation of both T cells and B cells capable of interacting specifically with the same antigen. The importance of TH cells becomes apparent when one considers the devastating effects wrought by HIV, the virus that causes AIDS. TH cells are the primary targets of HIV. Most HIV-infected people remain free of symptoms as long as their TH-cell count remains relatively high, above 500 cells/μl of blood (the normal count is over 1000 cells/μl). Once the count drops below approximately 200 cells/μl, a person develops full-blown AIDS and becomes prone to attack by viral and cellular pathogens.

¹ There are three major subtypes of helper T cells, TH1, TH2, and TH17 cells, which can be distinguished by the cytokines they secrete. All of these helper T cells stimulate B cells to produce antibodies, but each also has a unique function. TH1 cells produce IFN-γ, which protects the body by activating macrophages to kill intracellular pathogens they might harbor (page 374). TH2 cells produce IL-4, which mobilizes mast cells, basophils, and eosinophils to protect against extracellular pathogens, especially parasitic worms. TH17 cells produce IL-17, which is thought to stimulate epithelial cells to recruit phagocytes and thereby prevent the entry of extracellular bacteria and fungi into the body. TH17 cells are also implicated in the development of autoimmune disease. The three subtypes of TH cells differentiate from a common precursor following stimulation by different cytokines and are defined by the presence of different “master” transcription factors (page 518).

Figure 17.12 Highly simplified, schematic drawing showing the role of TH cells in antibody formation. In step 1, the macrophage interacts with the complex antigen. The antigen is taken into the macrophage and cleaved into fragments, which are displayed at the cell surface. In step 2, the macrophage interacts with a TH cell whose TCR is bound to one of the displayed antigen fragments (the green membrane protein is an MHC molecule, page 716). This interaction activates the T cell. In step 3, the activated TH cell interacts with a B cell whose antigen receptor is bound to an intact, soluble antigen. B-cell activation is stimulated by cytokines (e.g., IL-4, IL-5, and IL-6) released by the TH cell into the space separating it from the adjacent B cell. Interaction with the TH cell activates the B cell, causing it to proliferate (step 4). The progeny of the activated B cell differentiate into plasma cells that synthesize antibodies that can bind the antigen (step 5).

3. Regulatory T lymphocytes (TReg cells) are primarily inhibitory cells that suppress the proliferation and activities of various types of immune cells. Virtually all of the suppressive activities of TReg cells have been studied in vitro, so it remains much less clear how these cells operate in the body. TReg cells are characterized by possession of CD4⁺CD25⁺ surface markers and are thought to play an important role in limiting inflammation and maintaining immunologic self-tolerance. TReg cells carry out this latter activity by preventing CTLs that carry self-reactive receptors from attacking the body’s own cells. TReg cells appear also to protect a fetus from immunological attack by the pregnant mother. On the other hand, these same cells may prove detrimental to our health by preventing the immune system from ridding the body of tumor cells. The differentiation of TReg cells requires stimulation by the cytokine IL-2 and leads to the expression by the cells of a key transcription factor, FOXP3. Mutations in the FOXP3 gene result in a fatal disease (IPEX), which is

Table 17.2 Classes of Human Immunoglobulins

Class   Heavy chain   Light chain   Molecular mass (kDa)   Properties
IgA     α             κ or λ        360–720                Present in tears, nasal mucus, breast milk, intestinal secretions
IgD     δ             κ or λ        160                    Present in B-cell plasma membranes; function uncertain
IgE     ε             κ or λ        190                    Binds to mast cells, releasing histamine responsible for allergic reactions
IgG     γ             κ or λ        150                    Primary blood-borne soluble antibodies; crosses placenta
IgM     μ             κ or λ        950                    Present in B-cell plasma membranes; mediates initial immune response; activates bacteria-killing complement

characterized by severe autoimmunity in newborn infants. Defects in TReg cells are widely suspected of playing a key role in the development of most autoimmune diseases. Studies of TReg cells have provided direct demonstration that homeostasis in the immune system requires a tight balance between stimulatory and inhibitory influences.
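The estimate given earlier in this section, that a single activated T cell can divide three to four times per day for several days, implies very rapid clonal expansion. A rough sketch of the arithmetic follows. It assumes that every division doubles the population with no cell death, and the 3-day and 5-day durations are illustrative stand-ins for "several days" rather than figures from the text; the function name is likewise hypothetical.

```python
def clone_size(divisions_per_day, days, start_cells=1):
    """Cells after repeated doubling, assuming no cell death."""
    return start_cells * 2 ** (divisions_per_day * days)


for divisions_per_day in (3, 4):
    for days in (3, 5):  # ASSUMPTION: illustrative values for "several days"
        cells = clone_size(divisions_per_day, days)
        print(f"{divisions_per_day} divisions/day for {days} days: {cells:,} cells")
```

Under these assumptions, three divisions per day for three days turns one cell into 512, and four divisions per day for five days gives 2^20, or roughly a million cells, which is consistent with the visible enlargement of lymph nodes during an infection.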

REVIEW 1. How does an infected cell in the body reveal its condition to a T cell? What is the T cell’s response? 2. What is an APC? What types of cells can act as APCs? 3. Compare and contrast the properties and functions of a TH cell and a CTL. A TH cell and a TReg cell.

17.4 | Selected Topics on the Cellular and Molecular Basis of Immunity

The Modular Structure of Antibodies

Antibodies are proteins produced by B cells and their descendants (plasma cells). B cells incorporate antibody molecules into their plasma membrane, where they serve as antigen receptors, whereas plasma cells secrete these proteins into the blood or other bodily fluids, where they serve as a molecular arsenal in the body’s war against invading pathogens. Interaction between blood-borne antibodies and antigens on the surface of a virus or bacterial cell can neutralize the pathogen’s ability to infect a host cell and facilitate the pathogen’s ingestion and destruction by wandering phagocytes. The immune system produces millions of different antibody molecules that, taken together, can bind any type of foreign substance to which the body may be exposed. Though the immune system exhibits great diversity through the antibodies it produces, a single antibody molecule can interact with only one or a few closely related antigenic structures.

Antibodies are globular proteins called immunoglobulins. Immunoglobulins are built of two types of polypeptide chains, larger heavy chains (molecular mass of 50,000 to 70,000 daltons) and smaller light chains (molecular mass of 23,000 daltons). The two types of chains are linked to one another in pairs by disulfide bonds. Five different classes of immunoglobulin (IgA, IgD, IgE, IgG, and IgM) have been identified. The different immunoglobulins appear at different times after exposure to a foreign substance and have different biological functions (Table 17.2). IgM molecules are the first antibodies secreted by B cells following stimulation by an antigen, appearing in the blood after a lag of a few days (Figure 17.13). IgM molecules have a relatively short half-life (about 5 days), and their appearance is followed by secretion of longer-lived IgG and/or IgE molecules. IgG molecules are the predominant antibodies found in the blood and lymph during a secondary response to most antigens (Figure 17.13). IgE molecules are produced at high levels in response to many

parasitic infections. IgE molecules are also bound with high affinity to the surface of mast cells, triggering histamine release, which causes inflammation and symptoms of allergy. IgA is the predominant antibody in secretions of the respiratory, digestive, and urogenital tracts and acts to protect these mucosal linings from pathogens. The function of IgD is unclear.

Figure 17.13 Primary and secondary antibody responses. A primary response, which is elicited by an initial exposure to an antigen, leads first to the production of soluble IgM antibody molecules, followed by production of soluble IgG antibody molecules. When the antigen is reintroduced at a later time, a secondary response is initiated. In contrast to the primary response, the secondary response begins with production of IgG (as well as IgM) molecules, leads to a much higher antibody level in the blood, and occurs with almost no delay.
[Plot: antibody level in blood (axis ticks from 10⁰ to 10⁴) versus time (weeks); curves labeled IgM and IgG; the primary response follows the initial stimulus with antigen, and the secondary response follows the secondary stimulus with the same antigen.]

There are two types of light chains, kappa (κ) chains and lambda (λ) chains, both of which are present in the immunoglobulins of all five classes. In contrast, each immunoglobulin class has a unique heavy chain that defines that class (Table 17.2).² We will focus primarily on the structure of IgGs. An IgG molecule is composed of two identical light chains and two identical heavy chains arranged to form a Y-shaped molecule, as shown in Figure 17.14a and described below.

² Humans actually produce four related heavy chains as part of their IgG molecules (forming IgG1, IgG2, IgG3, and IgG4) and two related heavy chains as part of their IgA molecules (forming IgA1 and IgA2) (see Figure 17.19). These differences will not be mentioned in the following discussion.

Figure 17.14 Antibody structure. (a) Ribbon model of an IgG molecule. The molecule contains four polypeptide chains: two identical light chains and two identical heavy chains. One of the heavy chains is shown in blue, the other in yellow, while both light chains are shown in red. The domains of each chain (two per light chain and four per heavy chain) are evident. (b) Schematic model showing the domain structure of an IgG molecule. The tertiary structure of each Ig domain is maintained by a disulfide bond. Domains comprising a constant region of the polypeptide chain are indicated by the letter C; domains comprising a variable region are indicated by the letter V. Each heavy chain contains three CH regions (CH1, CH2, CH3) and one VH region at the N-terminus of the polypeptide. Each light chain contains one CL and one VL region at its N-terminus. The variable regions of each light and heavy chain form an antigen-combining site. Each Y-shaped IgG molecule contains two identical antigen-combining sites. Each IgG molecule can be fragmented by mild proteolytic treatment into two Fab fragments that contain the antigen-combining sites and an Fc fragment, as indicated. (A: COURTESY OF ALEXANDER MCPHERSON.)

To determine the basis of antibody specificity, it was first necessary to determine the amino acid sequence of a number of specific antibodies. Normally, the first step in amino acid sequencing is to purify the particular protein to be studied. Under normal conditions, however, it is impossible to obtain a purified preparation of a specific antibody from the blood because each person produces a large number of different antibody molecules that are too similar in structure to be separated from one another. The problem was solved when it was discovered that the blood of patients suffering from a type of lymphoid cancer called multiple myeloma contained large quantities of a single antibody molecule. As described in Chapter 16, cancer is a monoclonal disease; that is, the cells of a tumor arise from the proliferation of a single wayward cell. Because a single lymphocyte normally produces only a single species of antibody, a patient with multiple myeloma produces large amounts of the specific antibody that is synthesized by the particular cell that became malignant. One patient produces one highly abundant antibody species, and other patients produce different antibodies. As a result, investigators were able to purify substantial quantities of several antibodies from a number of patients and compare their amino acid sequences.

An important pattern was soon revealed. It was found that half of each kappa light chain has a constant amino acid sequence among all kappa chains, whereas the other half varies from patient to patient. A similar comparison of amino acid sequences of several lambda chains from different patients revealed that they too consist of a section of constant sequence and a section whose sequence varies from one immunoglobulin to the next. The heavy chains of the purified IgGs also contain a variable (V) and a constant (C) portion. A schematic structure of one of these IgG molecules is shown in Figure 17.14b.

It was further found that, whereas approximately half of each light chain consists of a variable region (VL), only one-quarter of each heavy chain is variable (VH) among different patients; the remaining three-quarters of the heavy chain (CH) is constant for all IgGs. The constant portion of the heavy chain can be divided into three sections of approximately equal length that are clearly homologous to one another. These homologous Ig units are designated CH1, CH2, and CH3 in Figure 17.14b. It appears that the three sections of the C part of the IgG heavy chain (as well as those of heavy chains of the other Ig classes and the C portions of both kappa and lambda light chains) arose during evolution by the duplication of an ancestral gene that coded for an Ig unit of approximately 110 amino acids. The variable regions (VH or VL) are also thought to have arisen by evolution from the same ancestral Ig unit. Structural analysis indicates that each of the homologous Ig units of a light or heavy chain folds independently to form a compact domain that is held together by

Figure 17.15 Antibody domains. Schematic drawing of a human lambda light chain synthesized by cells from a patient with multiple myeloma. The polypeptide undergoes folding so that the constant and variable portions are present in separate domains. Thick arrows represent β strands, which are assembled into β sheets. Each domain has two β sheets, which are distinguished by the red and orange colors. The three hypervariable (Hv) segments of the chain are present as loops at one end of the variable domain, which forms part of the antigen-combining site of the antibody. (REPRINTED (ADAPTED) WITH PERMISSION FROM SCHIFFER ET AL., BIOCHEMISTRY 12:4628, 1973. COPYRIGHT 1973. AMERICAN CHEMICAL SOCIETY.)
[Diagram labels: NH2 terminus; COOH terminus; disulfide bond; VL domain; CL domain; three Hv loops.]

a disulfide bond (Figure 17.15). In an intact IgG molecule, each light chain domain associates with a heavy chain domain as shown in Figure 17.14a,b. Genetic analysis indicates that each domain is encoded by its own exon. The specificity of an antibody is determined by the amino acids of the antigen-combining sites at the ends of each arm of the Y-shaped antibody molecule (Figure 17.14). The two combining sites of a single IgG molecule are identical, and each is formed by the association of the variable portion of a light chain with the variable portion of a heavy chain (Figure 17.14). Assembly of antibodies from different combinations of light and heavy chains allows a person to produce a tremendous variety of antibodies from a modest number of different polypeptides (page 715). A closer look at the polypeptides of immunoglobulins reveals that the variable portions of both heavy and light chains contain subregions that are especially variable, or hypervariable, from one antibody to another (labeled Hv in Figure 17.15). Light and heavy chains both contain three hypervariable stretches that are clustered at the ends of each arm on the antibody molecule. As might be expected, the hypervariable regions play a prominent role in forming the structure of the antigen-combining site, which can range from a deep cleft to a narrow groove or a relatively flat pocket. Variations in the amino acid sequence of hypervariable regions account for the great diversity of antibody specificity, allowing these molecules to bind to antigens of every conceivable shape. The combining site of an antibody has a complementary stereochemical structure to a particular portion of the antigen, which is called the epitope (or antigenic determinant). Because of their close fit, antibodies and antigens form stable complexes, even though they are joined only by noncovalent forces that individually are quite weak. 
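The domain counts given for IgG (two Ig domains per light chain and four per heavy chain, each derived from an Ig unit of roughly 110 amino acids) can be checked against the chain masses quoted earlier in this section. The sketch below is a back-of-the-envelope calculation only. The average residue mass of 110 Da is an assumed rule of thumb that does not appear in the text, and the names `CHAINS` and `chain_estimate` are illustrative.

```python
IG_UNIT_RESIDUES = 110      # from the text: Ig unit of approximately 110 amino acids
AVG_RESIDUE_MASS_DA = 110   # ASSUMPTION: rule-of-thumb average mass per residue

# Domain counts for the chains of IgG, as given in the Figure 17.14 caption.
CHAINS = {
    "light": {"domains": 2, "variable_domains": 1},
    "heavy": {"domains": 4, "variable_domains": 1},
}


def chain_estimate(domains, variable_domains):
    """Return (residues, mass in daltons, variable fraction) for one chain."""
    residues = domains * IG_UNIT_RESIDUES
    mass_da = residues * AVG_RESIDUE_MASS_DA
    return residues, mass_da, variable_domains / domains


estimates = {name: chain_estimate(**spec) for name, spec in CHAINS.items()}

# An IgG molecule contains two light chains and two heavy chains.
igg_mass_da = 2 * estimates["light"][1] + 2 * estimates["heavy"][1]

for name, (residues, mass_da, var_frac) in estimates.items():
    print(f"{name} chain: ~{residues} residues, ~{mass_da / 1000:.1f} kDa, "
          f"{var_frac:.0%} variable")
print(f"IgG (2 light + 2 heavy): ~{igg_mass_da / 1000:.1f} kDa")
```

Under these assumptions a light chain comes out near 24 kDa and an IgG heavy chain near 48 kDa, in line with the stated 23,000 daltons and the lower end of the 50,000 to 70,000 dalton range, and the variable fractions reproduce the one-half and one-quarter figures. The summed estimate of about 145 kDa is close to the 150 kDa listed for IgG in Table 17.2; the estimate counts only the Ig domains and ignores the hinge region and any attached carbohydrate.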
The precise interaction between a particular antigen and antibody, as determined by X-ray crystallography, is shown in Figure 17.16. The two hinge regions within the molecule (Figure 17.14) provide the flexibility necessary for the antibody to bind to two separate antigen molecules or to a single molecule with two identical epitopes. Whereas the hypervariable portions of light and heavy chains determine an antibody’s combining site specificity, the remaining portions of the variable domains provide a scaffold that maintains the overall structure of the combining site. The constant portions of antibody molecules are also important. Different classes of antibodies (IgA, IgD, IgE, IgG, and IgM) have different heavy chains, whose constant regions differ considerably in length and sequence. These differences enable antibodies of various classes to carry out different biological (effector) functions. For example, heavy chains of an IgM molecule bind and activate one of the proteins of the complement system, which leads to the lysis of bacterial cells to which the IgM molecules are bound. Heavy chains of IgE molecules play an important role in allergic reactions by binding to specific receptors on the surfaces of mast cells, triggering the release of histamine. In contrast, heavy chains of an IgG molecule bind specifically to the surface receptors of macrophages and neutrophils, inducing these phagocytic cells to ingest the particle to which the antibodies are bound. The

713

glutamine residue of lysozyme is red. (FROM A. G. AMIT, R. A. MARIUZZA, S. E. V. PHILLIPS, AND R. J. POLJAK, SCIENCE 233:749, 1986; © 1986, REPRINTED WITH PERMISSION FROM AAAS.)

heavy chains of IgG molecules are also important in allowing this class of antibodies to pass from the blood vessels of a mother to those of her fetus during pregnancy. While this provides the fetus and newborn with passive immunity to infectious organisms, it may cause a life-threatening condition called erythroblastosis fetalis. For this condition to occur, an Rh⫺ mother must have given birth to a child with an Rh⫹ phenotype (Rh⫹/Rh⫺ genotype) during a previous pregnancy. The mother is usually exposed to the Rh⫹ fetal antigen during delivery of the first child, who is not affected. If, however, the mother should have a second Rh⫹ pregnancy, antibodies present in her bloodstream can enter the fetal circulation and destroy the red blood cells of the fetus. Babies with this condition are given a blood transfusion, either before (intrauterine) or after birth, which cleanses the blood of maternal antibodies.

DNA Rearrangements that Produce Genes Encoding B- and T-Cell Antigen Receptors

As discussed above, each IgG molecule is composed of two light (L) chains and two heavy (H) chains. Both types of polypeptides consist of two recognizable parts: a variable (V) portion, whose amino acid sequence varies from one antibody species to another, and a constant (C) portion, whose amino acid sequence is identical among all H or L chains of the same class. What is the genetic basis for the synthesis of polypeptides having a combination of shared and unique amino acid sequences? In 1965, William Dreyer of the California Institute of Technology and J. Claude Bennett of the University of Alabama put forward the “two gene–one polypeptide” hypothesis to account for antibody structure. In essence, Dreyer and Bennett proposed that each antibody chain is encoded by two separate genes (a C gene and a V gene) that somehow combine to form one continuous “gene” that codes for a single light or heavy chain. In 1976, Susumu Tonegawa, working at a research institute in Basel, Switzerland, provided clear evidence in favor of the DNA rearrangement hypothesis. The basic outline of the experiment is shown in Figure 17.17. In this experiment, Tonegawa and his colleagues compared the length of DNA between the nucleotide sequences encoding the C and V portions of a specific antibody chain in two different types of mouse cells: early embryonic cells and malignant, antibody-producing myeloma cells. The DNA segments encoding the C and V portions of the antibody were widely separated in embryonic DNA but were very close to each other in DNA obtained from antibody-producing myeloma cells (Figure 17.17). These findings strongly suggested that DNA segments encoding the parts of antibody molecules became rearranged during the formation of antibody-producing cells. Subsequent research revealed the precise arrangement of the DNA sequences that give rise to antibody genes. To simplify the discussion, we will consider only those DNA sequences involved in the formation of human kappa light chains, which are located on chromosome 2. The organization of sequences in germ-line DNA (i.e., the DNA of a sperm or egg) that are involved in the formation of human kappa light chains is shown in the top line (step 1) of Figure 17.18. In this case, a variety of different Vκ genes are located in a linear array and separated from a single Cκ gene by some distance. Nucleotide-sequence analysis of these V genes indicated that they are shorter than required to encode the V region of the kappa light chain. The reason became clear when other segments in the region were sequenced. The stretch of nucleotides that encodes the 13 amino acids at the carboxyl end of a V region is located at some distance from the remainder of the Vκ gene sequence. This small portion that encodes the carboxyl end of the V region is termed the J segment. As shown in Figure 17.18, there are five distinct Jκ segments of related nucleotide sequence arranged in tandem. The cluster of Jκ segments is separated from the Cκ gene by an additional

17.4 Selected Topics on the Cellular and Molecular Basis of Immunity

Figure 17.16 Antigen–antibody interaction. Space-filling model based on X-ray crystallography of a complex between lysozyme (green) and the Fab portion of an antibody molecule (see Figure 17.14). The heavy chain of the antibody is blue; the light chain is yellow. A

[Figure 17.17 diagram. Panels: embryonic cell; antibody-producing cells. Step 1: extract DNA, treat with restriction enzyme. Step 2: separate DNA fragments by gel electrophoresis; two identical gels are prepared from each DNA sample (gels a to d). Step 3: following electrophoresis, incubate gels a and c with a radioactively labeled C gene probe and gels b and d with a radioactively labeled V gene probe; locate sites of hybridized labeled DNA by autoradiography. Remaining labels: positions of bands in gels; positions of genes in DNA restriction fragments (9 kb C fragment, 6 kb V fragment, 3 kb C+V fragment).]

[Figure 17.18 diagram. Step 1, germ-line DNA: V segments (V1 to V5, Vn), J1 to J5, and C, with a 2.5 kb label. Step 2: V5 and J1 labeled apart from V1 to V4, J2 to J5, and C. Step 3, B-cell DNA: V1, V2, V3, V4, J2, J3, J4, J5, C. Step 4, transcription, primary transcript: V4, J2, J3, J4, J5, C. Step 5, mature mRNA: V4, J2, C.]

Chapter 17 The Immune Response

Figure 17.17 Experimental demonstration that genes encoding antibody light chains are formed by DNA rearrangement. DNA is first extracted from either embryonic cells or antibody-producing cancer cells and fragmented by a restriction endonuclease (step 1) that cleaves both strands of the DNA at a specific sequence. The DNA fragments from the two preparations are fractionated separately by gel electrophoresis; two identical gels are prepared from each DNA sample (step 2). Following electrophoresis, each of the gels is incubated with a labeled probe containing either the variable (V) or constant (C) gene sequence (step 3). The location of the bound, labeled DNA in the gel is revealed by autoradiography and shown at the bottom of the figure. Whereas the V and C gene sequences are located on separate fragments in DNA from embryonic cells, the two sequences are located on the same small fragment in DNA from the antibody-producing cells. The V and C gene sequences are brought together during B-cell development by a process of DNA rearrangement.
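The reasoning behind the experiment in Figure 17.17 can be sketched as a toy calculation. The following Python fragment is only an illustration: the restriction-site positions and gene coordinates are hypothetical, chosen so that the fragment sizes match the 9 kb, 6 kb, and 3 kb labels in the figure, and the assignment of the two larger fragments to embryonic DNA follows the caption rather than any data given here.

```python
def digest(length, cut_sites):
    """Cut a linear DNA of `length` bp at each site; return (start, end) fragments."""
    bounds = [0] + sorted(cut_sites) + [length]
    return list(zip(bounds, bounds[1:]))


def probe_hits(fragments, gene_span):
    """Sizes (bp) of the fragments that fully contain the probed sequence."""
    start, end = gene_span
    return [b - a for a, b in fragments if a <= start and end <= b]


# Hypothetical maps; all positions are in base pairs and are NOT from the text.
embryonic = {
    "length": 20_000,
    "cuts": [2_000, 8_000, 10_000, 19_000],
    "V": (4_000, 4_300),    # lies on the 6 kb fragment (2,000 to 8,000)
    "C": (14_000, 14_300),  # lies on the 9 kb fragment (10,000 to 19,000)
}
myeloma = {
    "length": 9_000,
    "cuts": [3_000, 6_000],
    "V": (3_500, 3_800),    # V and C now lie on the same
    "C": (5_000, 5_300),    # 3 kb fragment (3,000 to 6,000)
}


def southern(sample):
    """Band sizes detected by the V probe and by the C probe."""
    fragments = digest(sample["length"], sample["cuts"])
    return {probe: probe_hits(fragments, sample[probe]) for probe in ("V", "C")}


embryo_bands = southern(embryonic)
myeloma_bands = southern(myeloma)
print(embryo_bands)   # {'V': [6000], 'C': [9000]}
print(myeloma_bands)  # {'V': [3000], 'C': [3000]}
```

In this sketch the two probes light up bands of different sizes in the embryonic sample but a single shared band in the myeloma sample, which is the pattern that points to DNA rearrangement.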

Figure 17.18 DNA rearrangements that lead to the formation of a functional gene that encodes an immunoglobulin κ chain. The organization of the variable (V), joining (J), and constant (C) DNA sequences within the genome is shown in step 1. Steps leading to the synthesis of the mature mRNA that encodes the κ chain polypeptide are described in the text. The random union of a V and J segment (steps 2 and 3) determines the amino acid sequence of the polypeptide. The space between the “chosen” J segment and the C segment (which may contain one or more J segments, as shown in the figure) remains as an intron in the gene. The portion of the primary transcript (step 4) that corresponds to this intron is removed during RNA processing (step 5). DNA rearrangement and subsequent transcription of the “stitched” gene occur on only one allele in each cell, which ensures that the cell will express only a single κ chain. The other allele on the homologous chromosome typically remains unaltered and is not transcribed.
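The sequence of events in Figure 17.18 can be followed with a small Python sketch. It is a simplified model under stated assumptions: the locus is reduced to an ordered list of segment names (the Vn segment and all spacer DNA are left out), and the function names are invented for illustration. The choice of V4 and J2 mirrors the segments labeled in the figure.

```python
# Toy germ-line kappa locus (step 1); spacer DNA and Vn are omitted.
GERM_LINE = ["V1", "V2", "V3", "V4", "V5", "J1", "J2", "J3", "J4", "J5", "C"]


def rearrange(locus, v, j):
    """Join segment v to segment j and excise the DNA between them (steps 2-3)."""
    i_v, i_j = locus.index(v), locus.index(j)
    kept = locus[: i_v + 1] + locus[i_j:]
    excised = locus[i_v + 1 : i_j]
    return kept, excised


def transcribe(locus, v):
    """The primary transcript runs from the chosen V segment through C (step 4)."""
    return locus[locus.index(v):]


def splice(transcript):
    """Remove the intron between the joined J and C (step 5): keep V, J, and C."""
    return [transcript[0], transcript[1], transcript[-1]]


b_cell_dna, excised = rearrange(GERM_LINE, "V4", "J2")
primary = transcribe(b_cell_dna, "V4")
mrna = splice(primary)

print(b_cell_dna)  # ['V1', 'V2', 'V3', 'V4', 'J2', 'J3', 'J4', 'J5', 'C']
print(excised)     # ['V5', 'J1']
print(primary)     # ['V4', 'J2', 'J3', 'J4', 'J5', 'C']
print(mrna)        # ['V4', 'J2', 'C']
```

Note that the unused J3 to J5 segments survive the DNA rearrangement and are removed only at the RNA level, as part of the intron.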

stretch of over 2000 nucleotides. A complete kappa V gene is formed, as shown in Figure 17.18 (steps 2–3), as a specific Vκ gene is joined to one of the Jκ segments with the intervening DNA excised. This process is catalyzed by a protein complex called V(D)J recombinase. As indicated in Figure 17.18, the Vκ gene sequence generated by this DNA rearrangement is still separated from the Cκ gene by more than 2000 nucleotides. No further DNA rearrangement occurs in kappa gene assembly prior to transcription; the entire genetic region is transcribed into a large primary transcript (step 4) from which introns are excised by RNA splicing (step 5). Rearrangement begins as double-stranded cuts are made in the DNA between a V gene and a J gene. The cuts are


million different antibody species from a few hundred genetic elements present in the germ line.³ We have seen how antibody diversity arises from (1) the presence of multiple V exons, J exons, and D exons in the DNA of the germ line, (2) variability in V–J and V–D–J joining, and (3) the enzymatic insertion of nucleotides. An additional mechanism for generating antibody diversity, referred to as somatic hypermutation, occurs long after DNA rearrangement is complete. When a specific antigen is reintroduced into an animal following a period of time, antibodies produced during the secondary response have a much greater affinity for the antigen than those produced during the primary response. Increased affinity is due to small changes in amino acid sequence of variable regions of the heavy and light antibody chains. These sequence changes result from mutations in the genes that encode these polypeptides. It is estimated that rearranged DNA elements encoding antibody V regions have a mutation rate 10⁵ times greater than that of other genetic loci in the same cell. Included in the mechanism responsible for this increased level of V-region mutation are (1) an enzyme, known as activation-induced cytosine deaminase (AID), that converts cytosine residues in DNA into uracil residues, leading to U:G mismatches and subsequent mutations during DNA repair, and (2) one or more translesion DNA polymerases (page 569) that tend to make errors when DNA containing uracils is copied or repaired. Persons who carry mutations in AID and are unable to generate somatic hypermutation are plagued by infections and often die at an early age. Somatic hypermutation generates random changes in the V regions of Ig genes. Those B cells whose genes produce Ig molecules with greater antigen affinity are preferentially selected following antigen reexposure.
Selected cells proliferate to form clones that undergo additional rounds of somatic mutation and selection, whereas nonselected cells that express low-affinity Igs undergo apoptosis. In this way, the antibody response to recurrent or chronic infections improves markedly over time. Once a B cell is committed to form a specific antibody, it can switch the class of Ig it produces (e.g., from IgM to IgG) by changing the heavy chain produced in the cell. This process, known as class switching, occurs without changing the combining site of the antibodies synthesized. Recall that there are five different types of heavy chains distinguished by their constant regions. The genes that encode the constant regions of heavy chains (CH portions) are clustered together in a complex, as shown in Figure 17.19. Class switching is accomplished by moving a different CH gene next to the VDJ gene

³ A roughly comparable number of antibodies containing lambda light chains can also be generated.
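The estimate of more than 200 million antibody species discussed in this section can be retraced with a few lines of arithmetic. The Python sketch below uses only the segment counts and chain totals quoted in the text; the variable names are illustrative, and the "implied factor" is simply a back-calculation from those figures, not a measured value.

```python
# Heavy-chain gene segments in the human genome (numbers from the text).
V_H, D_H, J_H = 51, 25, 6

# Combinations available from V-D-J joining alone.
heavy_combinatorial = V_H * D_H * J_H
print(heavy_combinatorial)  # 7650

# Chain totals quoted in the text once junctional variability is included.
KAPPA_CHAINS = 2_000
HEAVY_CHAINS = 100_000

# Factor that junctional variability must contribute to reach 100,000.
implied_junctional_factor = HEAVY_CHAINS / heavy_combinatorial
print(round(implied_junctional_factor, 1))  # 13.1

# Any kappa chain paired with any heavy chain.
antibody_species = KAPPA_CHAINS * HEAVY_CHAINS
print(antibody_species)  # 200000000
```

Under these assumptions, joining of V, D, and J segments alone accounts for fewer than 8000 heavy chains, so the imprecision of joining contributes more than a tenfold increase.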

[Figure 17.19 diagram labels, in order: V D J, Cμ, Cδ, Cγ3, Cγ1, Cα1, Cγ2, Cγ4, Cε, Cα2]

Figure 17.19 Arrangement of the C genes for the various human heavy chains. In humans, the heavy chains of IgM, IgD, and IgE are each encoded by a single gene, whereas those of IgG are encoded by four different genes and those of IgA by two different genes (see footnote, page 711).


catalyzed by a pair of proteins called RAG1 and RAG2 that are the working parts of the V(D)J recombinase. The four free ends that are generated are then joined in such a way that the V and J coding segments are linked to form an exon that encodes the variable region of the polypeptide chain, while the two ends of the intervening DNA are linked to form a small circular piece of DNA that is displaced from the chromosome (Figure 17.18, step 3). Joining of the broken DNA ends is accomplished by the same basic process used to repair DNA strand breaks that was depicted in Figure 13.28. The rearrangement of Ig DNA sequences has important consequences for a lymphocyte. Once a specific Vκ sequence is joined to a Jκ sequence, no other species of kappa chain can be synthesized by that cell. It is estimated that the DNA of human germ cells contains approximately 40 functional Vκ genes. Thus, if we assume that any V sequence may join to any J sequence, we expect that a person can synthesize approximately 200 different kappa chains (5 Jκ segments × 40 Vκ genes). But this is not the only source of diversity among these polypeptides. The site at which a J sequence is joined to a V sequence can vary somewhat from one rearrangement to another so that the same Vκ and Jκ genes can be joined in two different cells to produce kappa light chains having different amino acid sequences. Additional variability is achieved by the enzyme deoxynucleotidyl transferase, which inserts nucleotides at the sites of strand breakage. These sources of additional variability increase the diversity of kappa chains an additional tenfold, bringing the number to at least 2000 species. The site at which V and J sequences are joined is part of one of the hypervariable regions of each antibody polypeptide (Figure 17.15). Thus, slight differences at a joining site can have important effects on antibody–antigen interaction. We have restricted our discussion to kappa light chains for the sake of simplicity.
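The kappa-chain arithmetic in this paragraph can be written out explicitly. The following Python sketch uses the counts given in the text (40 functional Vκ genes, 5 Jκ segments, and a tenfold increase from variable joining and nucleotide insertion); the segment names are placeholders invented for the enumeration.

```python
from itertools import product

V_KAPPA = 40            # functional V-kappa genes (from the text)
J_KAPPA = 5             # J-kappa segments (from the text)
JUNCTIONAL_FACTOR = 10  # the "additional tenfold" quoted in the text

# Enumerate every V-J pairing, assuming any V can join any J.
v_segments = [f"V{i}" for i in range(1, V_KAPPA + 1)]
j_segments = [f"J{i}" for i in range(1, J_KAPPA + 1)]
pairings = list(product(v_segments, j_segments))

combinatorial = len(pairings)
total_kappa = combinatorial * JUNCTIONAL_FACTOR

print(combinatorial)  # 200
print(total_kappa)    # 2000
```

The enumeration makes the underlying assumption visible: each of the 200 pairings is counted once, and the tenfold factor is applied uniformly as a rough multiplier rather than derived from the joining mechanism.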
Similar types of DNA rearrangement occur during the commitment of a cell to the synthesis of a particular lambda light chain and to a particular heavy chain. Whereas the variable regions of light chains are formed from two distinct segments (V and J segments), the variable regions of heavy chains are formed from three distinct segments (V, D, and J segments) by similar types of rearrangement. DNA rearrangement of the heavy chain locus precedes DNA rearrangement of the light chain locus. The human genome contains 51 functional VH segments, 25 DH segments, and 6 JH segments. Given the additional diversity stemming from the variability in VH–DH and DH–JH joining, a person is able to synthesize at least 100,000 different heavy chains. The antigen receptors of T cells (TCRs) also consist of a type of heavy and light chain, whose variable regions are formed by a similar process of DNA rearrangement. The formation of antibody genes by DNA rearrangement illustrates the potential of the genome to engage in dynamic activities. Because of this rearrangement mechanism, a handful of DNA sequences that are present in the germ line can give rise to a remarkable diversity of gene products. As discussed above, a person synthesizes roughly 2000 different species of kappa light chains and 100,000 different species of heavy chains. If any kappa light chain can combine with any heavy chain, a person can theoretically produce more than 200


that was formed previously by DNA rearrangement. Class switching is under the direction of cytokines secreted by helper T cells during their interaction with the B cell producing the antibody molecule. For example, a helper T cell that secretes IFN-γ induces a switch in the adjacent B cell from its initial synthesis of IgM to synthesis of one of the IgG classes (Figure 17.13). Class switching allows a lineage of B cells to continue to produce antibodies having the same specificity but different effector functions (page 712).

Membrane-Bound Antigen Receptor Complexes

The recognition of antigen by both B and T lymphocytes occurs at the cell surface. An antigen receptor on a B cell (a B-cell receptor, or BCR) consists of a membrane-bound immunoglobulin that binds selectively to a portion of an intact antigen (i.e., the epitope) (Figure 17.20a). In contrast, the antigen receptor on a T cell (a T-cell receptor, or TCR, Figure 17.20b) recognizes and binds to a small fragment of an antigen, typically a peptide about 7 to 25 amino acids in length, that is held at the surface of another cell (described below). Both types of antigen receptors are part of large membrane-bound protein complexes that include invariant proteins as depicted in Figure 17.20. The invariant polypeptides associated with BCRs and TCRs play a key role in transmitting signals to the interior that lead to changes in activity of the B cell or T cell. Each subunit of a TCR contains two Ig-like domains, indicating that they share a common ancestry with BCRs. Like the heavy and light chains of the immunoglobulins, one of the Ig-like domains of each subunit of a TCR has a variable amino acid sequence; the other domain has a constant amino acid sequence (Figure 17.20). X-ray crystallographic studies have shown that the two types of antigen receptors also share a similar three-dimensional shape.

[Figure 17.20 diagram labels: (a) BCR with heavy chain, light chain, and associated α and β chains; (b) TCR with α and β chains and associated CD3 polypeptides (γ, δ, ε, ζ); signaling pathway.]

Figure 17.20 Structure of the antigen receptors of a B cell and a T cell. (a) The BCR of a B cell is a membrane-bound version of an immunoglobulin associated with a pair of invariant α chains and a pair of invariant β chains. The α and β chains are also members of the Ig superfamily. (b) The TCR of a T cell consists of an α and β polypeptide chain linked to one another by a disulfide bridge. Each polypeptide contains a variable domain that forms the antigen-binding site and a constant domain. The TCR is associated with six other invariant polypeptides of the CD3 protein as indicated in the illustration. (A small fraction of T cells contain a different type of TCR consisting of a γ and δ subunit. These cells are not restricted to recognition of MHC–peptide complexes, and their function is not well understood.)

The Major Histocompatibility Complex

During the first part of the twentieth century, clinical researchers discovered that blood could be transfused from one person to another, as long as the two individuals were compatible for the ABO blood group system. The success of blood transfusion led to the proposal that skin might also be grafted between individuals. This idea was tested during World War II when skin grafts were attempted on pilots and other military personnel who had received serious burns. The grafts were rapidly and completely rejected. After the war, researchers set out to determine the basis for tissue rejection. It was discovered that skin could be grafted successfully between mice of the same inbred strain, but that grafts between mice of different strains were rapidly rejected. Mice of the same inbred strain are like identical twins; they are genetically identical. Subsequent studies revealed that the genes that governed tissue graft rejection were clustered in a region of the genome that was named the major histocompatibility complex (MHC). Approximately 20 different MHC genes have been characterized, most of which are highly polymorphic: over 7,000 different alleles of MHC (i.e., HLA) genes have been identified, far more than for any other loci in the human genome. It is very unlikely, therefore, that two individuals in a population have the same combination of MHC alleles. This is the reason that transplanted organs are so likely to be rejected and why transplant patients are given drugs, such as cyclosporin A, to suppress the immune system following surgery. Cyclosporin A is a cyclic peptide produced by a soil fungus. It inhibits a particular phosphatase in the signaling pathway leading to the production of cytokines required for T-cell activation. Although these drugs help prevent graft rejection, they make patients susceptible to opportunistic infections similar to those that strike people with immunodeficiency diseases such as AIDS.
It is obvious that proteins encoded by the MHC did not evolve to prevent indiscriminate organ transplantation, which raises the question of their normal role. Long after their discovery as transplantation antigens, MHC proteins were shown to be involved in antigen presentation. Some of the key experiments that led to our current understanding of antigen presentation are discussed in the Experimental Pathways at the end of the chapter. It was noted earlier that T cells are activated by an antigen that has been dissected into small peptides and displayed on the surface of an antigen-presenting cell (an APC). These


small fragments of antigen are held at the surface of the APC in the grip of MHC proteins. Each species of MHC molecule can bind a large number of different peptides that share certain structural features that allow them to fit into its binding site (see Figure 17.24). For example, all of the peptides that are capable of binding to a protein encoded by a particular MHC allele, such as HLA-B8, may contain a specific amino acid at a certain position, which allows them to fit into the peptide-binding groove. Given the fact that each individual expresses a number of different MHC proteins (as in Figure 17.21a), and each MHC variant may be able to bind large numbers of different peptides (as in Figure 17.21b), a dendritic cell or macrophage should be able to display a vast array of peptides. At the same time, not every person is capable of presenting every possible peptide in an effective manner, which is thought to be a major factor in determining differences in susceptibility in a population to different infectious diseases, including AIDS. For example, the HLA-B*35 allele is associated with rapid progression to full-blown AIDS and the HLA-DRB1*1302 allele is correlated with resistance to a certain type of malaria and hepatitis B infection. The MHC alleles present in a given

[Figure 17.21 diagram labels: (a) MHC class I molecules HLA-A, HLA-B, HLA-C, HLA-E, HLA-F, HLA-G; (b) peptide.]

Figure 17.21 Human APCs can present large numbers of peptides. (a) Schematic model of the variety of class I MHC molecules an individual might possess. This class of MHC protein is encoded by a number of genes, three of which are represented by a large number of alleles. This particular individual is heterozygous at the HLA-A, -B, and -C loci and homozygous at the HLA-E, -F, and -G loci, which gives them a total of nine different MHC class I molecules. (The difference between class I and II MHC is discussed on page 717.) (b) Schematic model illustrating the variety of peptides that can be presented by the protein encoded by a single MHC allele. (The term HLA is an acronym for human leukocyte antigen reflecting the discovery of these proteins on the surface of leukocytes.)
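The two ideas in Figure 17.21 can be illustrated with a short Python sketch. Everything specific in it is hypothetical: the allele names are placeholders, the candidate peptides are invented sequences, and the anchor rule (leucine at position 2) is an assumed motif used only to show how a shared residue at a fixed position could define the set of peptides that one MHC variant presents.

```python
# (a) Hypothetical genotype: heterozygous at HLA-A, -B, -C and
# homozygous at HLA-E, -F, -G, as described in the caption.
genotype = {
    "HLA-A": ("A_1", "A_2"),
    "HLA-B": ("B_1", "B_2"),
    "HLA-C": ("C_1", "C_2"),
    "HLA-E": ("E_1", "E_1"),
    "HLA-F": ("F_1", "F_1"),
    "HLA-G": ("G_1", "G_1"),
}


def count_distinct_molecules(genotype):
    """Each distinct allele at each locus contributes one class I molecule."""
    return sum(len(set(alleles)) for alleles in genotype.values())


print(count_distinct_molecules(genotype))  # 9


# (b) Assumed anchor motif for one MHC variant: a given residue at a
# given 1-based position. Sequences below are invented examples.
def has_anchor(peptide, position, residue):
    return len(peptide) >= position and peptide[position - 1] == residue


candidates = ["ALDKRTSEV", "GLMNPQRSW", "AKDEFGHIL", "SLTTYYWWV"]
presented = [p for p in candidates if has_anchor(p, 2, "L")]
print(presented)  # ['ALDKRTSEV', 'GLMNPQRSW', 'SLTTYYWWV']
```

The three presented peptides differ at every position except the anchor, which is the sense in which a single MHC variant can display many unrelated peptides.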


population have been shaped by natural selection. Those persons who possess MHC alleles that are best able to present the peptides of a particular infectious agent will be the most likely to survive an infection by that agent. Conversely, persons who lack these alleles are more likely to die without passing on their alleles to offspring. As a result, populations tend to be more resistant to diseases to which their ancestors have been routinely exposed. This could explain why Native American populations have been devastated by certain diseases, such as measles, that produce only mild symptoms in persons of European ancestry. The entire process of T-cell-mediated immunity rests on the basis that small peptides derived from the proteins of a pathogen differ in structure from those derived from the proteins of the host. Consequently, one or more peptides held at the surface of an APC serve as a small representation of the pathogen, providing the cells of the immune system with a “glimpse” of the type of pathogen that is hidden within the cytoplasm of the infected cell. Nearly any cell in the body can function as an APC. Most cells present antigen as an incidental activity that alerts the immune system to the presence of a pathogen, but some professional APCs (e.g., dendritic cells, macrophages, and B cells) are specialized for this function, as discussed later in the chapter. When a T cell interacts with an APC, it does so by having its TCRs dock onto the MHC molecules projecting from the APC’s surface (Figure 17.22). This interaction brings a TCR of a T cell into an orientation that allows it to recognize the specific peptide displayed within a groove of an MHC molecule. The interaction between MHC proteins and TCRs is strengthened by additional contacts that form between cell-surface components, such as occur between CD4 or CD8 molecules on a T cell and MHC proteins on an APC (Figure 17.22). 
This specialized region that develops between a T cell and an APC is referred to as an immunologic synapse. MHC proteins can be subdivided into two major groups, MHC class I and MHC class II molecules. MHC class I molecules consist of one polypeptide chain encoded by an MHC allele (known as the heavy chain) associated noncovalently with a non-MHC polypeptide known as β2-microglobulin (see Figure 1, page 729). Differences in the amino acid sequence of the heavy chain are responsible for dramatic changes in the shape of the molecule’s peptide-binding groove. MHC class II molecules also consist of a heterodimer, but both subunits are encoded by MHC alleles. Both classes of MHC molecule as well as β2-microglobulin contain Ig-like domains and are thus members of the immunoglobulin superfamily. Whereas most cells of the body express MHC class I molecules on their surface, MHC class II molecules are expressed primarily by professional APCs. The two classes of MHC molecules display antigens that originate from different sites within a cell, though some overlap can occur. MHC class I molecules are predominantly responsible for displaying antigens that originate within the cytosol of a cell, that is, endogenous proteins. In contrast, MHC class II molecules primarily display fragments of exogenous antigens that are taken into a cell by phagocytosis.


Figure 17.22 Interaction between an antigen-presenting cell and a T cell during antigen presentation. (a) Electron micrograph of the two types of cells as they interact. (b) Schematic model showing some of the proteins present in the immunologic synapse that forms in the region of interaction between an APC and a cytotoxic T lymphocyte (CTL) or helper T (TH) cell. Antigen recognition occurs as the TCR of the T cell recognizes a peptide fragment of the antigen bound to a groove in an MHC molecule of the APC. As discussed in the text, CTLs recognize antigen in combination with an MHC class I molecule, whereas TH cells recognize antigen in combination with an MHC class II molecule. CD8 and CD4 are integral membrane proteins expressed by the two types of T cells that bind MHC class I and class II molecules, respectively. CD8 and CD4 are described as coreceptors. (Numerous other proteins present in these cell-cell interaction zones are not shown; see Nat. Revs. Immunol. 11:672, 2011.) (c) Fluorescence micrograph showing the immunologic synapse between an APC and a T cell. The TCRs of the T cell appear green, the MHC II molecules of the APC appear red. The colocalization of the TCR and MHC molecules generates the yellow fluorescence in the immunologic synapse. (A: FROM ALAN S. ROSENTHAL, REGULATION OF THE IMMUNE RESPONSE—ROLE OF THE MACROPHAGE, NEW ENGLAND JOURNAL OF MEDICINE, NOVEMBER 1980, VOL. 303, #20, P. 1154 © 1980 MASSACHUSETTS MEDICAL SOCIETY; B: AFTER L. CHATENOUD, MOL. MED TODAY 4:26, 1998. MOLECULAR MEDICINE TODAY BY ELSEVIER TRENDS JOURNALS. REPRODUCED WITH PERMISSION OF ELSEVIER TRENDS JOURNALS, IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER; C: FROM B. A. COBB ET AL., CELL 117:683, 2004, FIG. 6C, REPRINTED WITH PERMISSION FROM ELSEVIER. COURTESY OF DENNIS L. KASPER.)

[Figure 17.22 diagram labels: (a) macrophage, lymphocyte; (b) antigen-presenting cell (APC) with B7, ICAM-1, MHC I, MHC II, and bound peptide; cytotoxic T cell (CTL) with CD8 and TCR; T helper (TH) cell with CD4, CD28, and TCR; TCR α and β chains with CD3 subunits γ, δ, ε, ζ; (c) micrograph.]

The proposed pathways by which these two classes of MHC molecules pick up their antigen fragments and display them on the plasma membrane are described here and shown in Figure 17.23.

■ Processing of class I MHC–peptide complexes (Figure 17.23a). Antigens located in the cytosol of an APC are degraded into short peptides by proteases that are part of the cell’s proteasomes (page 541). These proteases cleave

[Figure 17.23 diagram labels: (a) endoplasmic reticulum, Golgi complex, cell surface, nascent heavy chain of MHC class I molecule, calnexin, heavy chain, β2m, PLC, TAP, MHC class I molecule, peptides, proteasome; steps 1 to 7, A, and B. (b) Golgi complex, nascent class II polypeptide, Ii molecule, Ii fragment, MHC class II molecule, antigen fragment, lysosome, endocytic vesicle; steps 1 to 7, A, and B.]

Figure 17.23 Classical processing pathways for antigens that become associated with MHC class I and class II molecules. (a) A proposed pathway for the assembly of an MHC class I–peptide complex. This pathway occurs in almost all cell types. In step 1, the heavy chain of the MHC protein is synthesized by a membrane-bound ribosome and translocated into the ER membrane. The MHC heavy chain becomes associated with calnexin (step 2), a chaperone in the ER membrane, and the dimeric complex binds the invariant β2m chain (step 3). The MHC complex then becomes associated with another ER membrane protein, TAP (step 4). Meanwhile, cytosolic antigens are taken into proteasomes (step A) and degraded into small peptides (step B). The peptides are transported into the ER lumen by the TAP protein, where they are trimmed to their final length by an ER peptidase (not shown). These peptides then become bound within the groove of the MHC molecule (step 5) with the help of a large complex of chaperones that is labeled PLC in the figure. PLC and calnexin dissociate from the MHC complex (step 6), which is transported along the biosynthetic/secretory pathway through the Golgi complex (step 7) to the plasma membrane, where it is ready to interact with the TCR of a CTL. (b) A proposed pathway for the assembly of an MHC class II–peptide complex. This pathway occurs in dendritic cells and other professional APCs. In step 1, the MHC protein is synthesized by a membrane-bound ribosome and translocated into the ER membrane, where it becomes associated with Ii (step 2), a trimeric protein that blocks the MHC–peptide-binding site. The MHC–Ii complex passes through the Golgi complex (step 3) and into a transport vesicle (step 4). Meanwhile, an extracellular protein antigen is taken into the APC by endocytosis (step A) and delivered to a lysosome (step B), where the antigen is fragmented into peptides. The lysosome containing antigenic fragments fuses with the transport vesicle containing the MHC–Ii complex (step 5), leading to the degradation of the Ii protein and association between the antigenic peptide fragment and the MHC class II molecule (step 6). The MHC–peptide complex is transported to the plasma membrane (step 7) where it is ready to interact with the TCR of a TH cell. (Note: Not all exogenous antigens follow the classical MHC class II pathway shown in part b. A pathway also exists by which exogenous antigens can be taken into APCs by endocytosis and degraded into peptides, which are then bound and displayed by MHC class I molecules. This cross-presentation pathway, as it is called, allows CTLs to become activated by exogenous antigen that would otherwise go “unseen.”) (A,B: AFTER D. B. WILLIAMS ET AL., TRENDS CELL BIOL. 6:271, 1996. TRENDS IN CELL BIOLOGY BY ELSEVIER LTD. REPRODUCED WITH PERMISSION OF ELSEVIER LTD. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


Figure 17.24 Peptides produced by antigen processing bind within a groove of the MHC protein molecule. These models illustrate the binding of peptides to an MHC class I (a) and MHC class II (b) molecule. The molecular surfaces of the MHC molecules are shown in white and the peptide in the peptide-binding site in color. The peptide in (a) is derived from the influenza virus matrix protein, and the peptide in (b) is derived from influenza virus hemagglutinin protein. The N-terminus of each peptide is on the left. (A,B: COURTESY OF T. JARDETZKY; TRENDS BIOCHEM. SCI. 22:378, 1997, FIG. 1, WITH PERMISSION FROM ELSEVIER SCIENCE.)

cytosolic proteins into fragments approximately 8 to 10 residues long that are suitable for binding within a groove of an MHC class I molecule (Figure 17.24a). The peptides are then transported across the membrane of the rough endoplasmic reticulum and into its lumen by a dimeric protein called TAP (Figure 17.23a). Once in the ER lumen, the peptide can bind to a newly synthesized MHC class I molecule, which is an integral protein of the ER membrane. The MHC–peptide complex moves through the biosynthetic pathway (Figure 8.2b) until it reaches the plasma membrane where the peptide is displayed.

■ Processing of class II MHC–peptide complexes (Figure 17.23b). MHC class II molecules are also synthesized as membrane proteins of the RER, but they are joined noncovalently to a protein called Ii, which blocks the peptide-binding site of the MHC molecule (Figure 17.23b). Following its synthesis, the MHC class II–Ii complex moves out of the ER along the biosynthetic pathway, directed by targeting sequences located within the cytoplasmic domain of Ii. MHC class I and II molecules are thought to separate from one another in the trans Golgi network (TGN), which is the primary sorting compartment along the biosynthetic pathway (page 298). An MHC class I–peptide complex is directed toward the cell surface, whereas an MHC class II–Ii complex is directed into a transport vesicle that fuses with a lysosome. There, the Ii protein is digested by acid proteases, leaving only a small fragment (CLIP) associated in the peptide-binding groove. CLIP is subsequently exchanged with one of the peptides derived from the digestion of antigens that were

taken into the cell by way of the endocytic pathway (Figure 17.23b).⁴ The MHC class II–peptide complex is then moved to the plasma membrane where it is displayed, as shown in Figure 17.24b. Once on the surface of an APC, MHC molecules direct the cell’s interaction with different types of T cells (Figure 17.22). Cytotoxic T lymphocytes (CTLs) recognize their antigen in association with MHC class I molecules; they are said to be MHC class I restricted. Under normal circumstances, cells of the body that come into contact with CTLs display fragments of their own normal proteins in association with their MHC class I molecules. Normal cells displaying normal protein fragments are ignored by the body’s T cells because T cells capable of binding with high affinity to peptides derived from normal cellular proteins are eliminated as they develop in the thymus. In contrast, when a cell is infected, it displays fragments of viral proteins in association with its MHC class I molecules. These cells are recognized by CTLs bearing TCRs whose binding sites are complementary to the viral peptides, and the infected cell is destroyed. The presentation of a single foreign peptide on the surface of a cell is probably sufficient to invite an attack by a CTL. Because virtually all cells of the body express MHC class I molecules on their surface, CTLs can combat an infection regardless of the type of cell affected. CTLs may also recognize and destroy cells that display abnormal (mutated) proteins on their surfaces, which could play a role in the elimination of potentially life-threatening tumor cells.

⁴ Peptides generated in lysosomes and attached to MHC class II molecules tend to be longer (10 to 25 residues) than those generated in proteasomes and attached to MHC class I molecules (typically 8 to 10 residues).


In contrast to CTLs, helper T cells recognize their antigen in association with MHC class II molecules; they are said to be MHC class II restricted. As a result, helper T cells are activated primarily by exogenous (i.e., extracellular) antigens (Figure 17.23b), such as those that occur as part of bacterial cell walls or bacterial toxins. MHC class II molecules are found predominantly on B cells, dendritic cells, and macrophages. These are the lymphoid cells that ingest foreign, extracellular materials and present the fragments to helper T cells. Helper T cells that are activated in this way can then stimulate B cells to produce soluble antibodies that bind to the exogenous antigen wherever it is located in the body.

Distinguishing Self from Nonself

T cells gain their identity in the thymus. When a stem cell migrates from the bone marrow to the thymus, it lacks the cell-surface proteins that mediate T-cell function, most notably its TCRs. The stem cells proliferate in the thymus to generate a population of T-cell progenitors. Each of these cells then undergoes the DNA rearrangements that enable it to produce a specific TCR. These cells are then subjected to a complex screening process in the thymus that selects for cells having potentially useful T-cell receptors (Figure 17.25). Studies suggest that epithelial cells of the thymus produce small quantities of a great variety of proteins normally restricted to other tissues throughout the body. Production of these tissue-specific antigens is under the control of a special transcriptional regulator (called AIRE) that is present only in the thymus. According to this model, the thymus re-creates an environment in which developing T cells can sample proteins containing a vast array of the body’s own unique epitopes.

[Figure 17.25 panel labels: (a) strong affinity (negative selection); (b) no affinity (death by neglect); (c) weak affinity (positive selection). Labeled elements: thymic cell, MHC, T cell.]

Figure 17.25 Determining the fate of a newly formed T cell in the thymus. A screening process takes place in the thymus that selects for T cells with appropriate TCRs. (a) Those T cells whose TCR exhibits strong affinity for MHC molecules bearing self-peptides are eliminated by apoptosis (negative selection). (b) Those T cells whose TCR fails to recognize MHC molecules bearing self-peptides also die by apoptosis (death by neglect). (c) In contrast, those T cells whose TCR exhibits weak affinity for MHC molecules bearing self-peptides survive (positive selection) and ultimately leave the thymus to constitute the peripheral T-cell population of the body.

5 B cells are also subjected to selective processes that lead to the death or inactivation of cells capable of producing autoreactive antibodies. In some cases, the light chains of autoreactive antibodies can be replaced by a new light chain encoded by an Ig gene that has been secondarily rearranged in a process called receptor editing.

17.4 Selected Topics on the Cellular and Molecular Basis of Immunity


T cells whose TCRs have a high affinity for peptides derived from the body’s own proteins are destroyed (Figure 17.25a). This process of negative selection greatly reduces the likelihood that the immune system will attack its own tissues. Humans lacking a functional AIRE gene suffer from a severe autoimmune disease (called APECED), in which numerous organs come under immunologic attack. The generation of T cells requires more than negative selection. When a TCR interacts with a foreign peptide on the surface of an APC, it must recognize both the peptide and the MHC molecule holding that peptide (discussed on page 728). Consequently, T cells whose TCRs do not recognize self-MHC molecules are of little value. The immune system screens out such cells by requiring that T cells recognize self-MHC–self-peptide complexes with low affinity. T cells whose TCRs are unable to recognize self-MHC complexes die within the thymus due to a lack of growth signals, a process referred to as “death by neglect” (Figure 17.25b). In contrast, T cells whose TCRs exhibit weak (low affinity) recognition toward self-MHC complexes are stimulated to remain alive but are not activated (Figure 17.25c). This process of selective survival is referred to as positive selection. It is estimated that fewer than 5 percent of thymic T cells survive these screening events.5 Those cells that recognize class I MHC molecules are thought to develop into cytotoxic (CD4⁻CD8⁺) T lymphocytes, whereas those that recognize class II MHC molecules are thought to differentiate into helper (CD4⁺CD8⁻) T lymphocytes. Both types of T cells leave the thymus and circulate for extended periods through the blood and lymph. T cells at this stage are described as naïve T cells because they have not yet encountered the specific antigen to which their TCR can bind.
As they pass through the lymphoid tissues, naïve T cells come into contact with various cells that either maintain their survival in a resting state or trigger their activation. As they percolate through the lymph nodes and other tissues, T cells scan the surfaces of cells for the presence of an inappropriate peptide bound to an MHC molecule. CD4⁺ T cells are activated by a foreign peptide bound to a class II MHC molecule, whereas CD8⁺ T cells are activated by foreign peptides bound to a class I MHC molecule. CD8⁺ T cells also respond strongly to cells bearing nonself-MHC molecules, such as the cells of a transplanted organ from a mismatched donor. In this latter case, they initiate a widespread attack against the graft cells that can result in organ rejection. Under normal physiologic conditions, autoreactive lymphocytes (i.e., lymphocytes capable of reacting to the body’s own tissues) are prevented from becoming activated by a number of poorly understood mechanisms that operate outside of the thymus in the body’s periphery. As discussed in the Human Perspective, a breakdown in these mechanisms of peripheral tolerance leads to the production of


autoantibodies and autoreactive T cells that can cause chronic tissue damage.
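The thymic screening summarized in Figure 17.25 can be read as a three-way decision on how strongly a TCR binds self-MHC–self-peptide complexes. The following is a minimal sketch of that decision, not a quantitative model: the 0-to-1 affinity scale, the two cutoff values, and the names `Fate` and `thymic_fate` are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class Fate(Enum):
    NEGATIVE_SELECTION = "eliminated by apoptosis (negative selection)"
    DEATH_BY_NEGLECT = "dies for lack of growth signals (death by neglect)"
    POSITIVE_SELECTION = "survives as a naive T cell (positive selection)"


# Hypothetical cutoffs on an arbitrary 0-to-1 affinity scale.
# The text gives no numerical thresholds; these are illustrative only.
WEAK_CUTOFF = 0.2
STRONG_CUTOFF = 0.7


def thymic_fate(self_affinity: float) -> Fate:
    """Classify a developing T cell by its TCR affinity for self-MHC-self-peptide."""
    if not 0.0 <= self_affinity <= 1.0:
        raise ValueError("affinity must lie between 0 and 1 on this toy scale")
    if self_affinity >= STRONG_CUTOFF:
        # Strong recognition of self: the cell is potentially autoreactive.
        return Fate.NEGATIVE_SELECTION
    if self_affinity < WEAK_CUTOFF:
        # No useful recognition of self-MHC: the cell receives no survival signal.
        return Fate.DEATH_BY_NEGLECT
    # Weak recognition of self-MHC: the cell is kept alive but not activated.
    return Fate.POSITIVE_SELECTION


if __name__ == "__main__":
    for affinity in (0.05, 0.4, 0.9):
        print(affinity, "->", thymic_fate(affinity).value)
```

Only the middle band survives in this sketch, which mirrors the statement in the text that a small minority of thymic T cells pass both screens.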

Lymphocytes Are Activated by Cell-Surface Signals

Lymphocytes communicate with other cells through an array of cell-surface proteins. As discussed above, T-cell activation requires an interaction between the TCR of the T cell and an MHC–peptide complex on the surface of another cell. This interaction provides specificity, which ensures that only T cells that bind the antigen are activated. T-cell activation also requires a second signal, called the costimulatory signal, which is delivered through a second type of receptor on the surface of a T cell. This receptor is distinct and spatially separate from the TCR. Unlike the TCR, the receptor that delivers the costimulatory signal is not specific for a particular antigen and does not require an MHC molecule to bind. The best studied of these interactions occur between helper T cells and professional antigen-presenting cells (e.g., dendritic cells and macrophages).

Activation of Helper T Cells by Professional APCs

TH cells recognize antigen fragments that are lodged in the binding cleft of MHC class II molecules on the surface of dendritic cells and macrophages. A costimulatory signal is delivered to a TH cell as the result of an interaction between a protein known as CD28 on the surface of the TH cell and a member of the B7 family of proteins on the surface of the APC (Figure 17.26a). The B7 protein appears on the surface of the APC after the phagocyte ingests foreign antigen. If the TH cell does not receive this second signal from the APC, rather than becoming activated, the TH cell either becomes nonresponsive (anergized) or is stimulated to undergo apoptosis (deleted). Because professional APCs are the only cells capable of delivering the costimulatory signal, they are the only cells that can initiate a TH response. As a result, normal cells of the body that bear proteins capable of combining with the TCRs of a T cell cannot activate TH cells. Thus, the requirement by a TH cell for two activation signals protects normal body cells from autoimmune attack involving TH cells. Prior to its interaction with an APC, a TH cell can be described as a quiescent cell, that is, one that has withdrawn from the cell cycle (a G0 cell, page 574). Once it receives the dual activation signals, a TH cell is stimulated to reenter the G1 phase of the cell cycle and eventually to progress through S phase into mitosis. Thus, interaction of T cells with a specific antigen leads to the proliferation (clonal expansion) of

Figure 17.26 Lymphocyte activation. (a) Schematic drawing depicting the interaction between a professional APC, in this case a mature dendritic cell, and a TH cell. Specificity in this cell–cell interaction derives from recognition by the TCR on the TH cell of the MHC class II–peptide complex displayed on the surface of the dendritic cell. Interaction between CD28 of the T cell and a B7 protein of the dendritic cell provides a nonspecific costimulatory signal that is required for T-cell activation. (Inset) Scanning electron micrograph of a mature dendritic cell (orange) presenting antigen to a number of T cells (green). (b) Schematic drawing of the interaction between an activated TH cell and a B cell. Specificity in this cell–cell interaction derives from recognition by the TCR on the TH cell of the MHC class II–peptide complex displayed on the surface of the B cell. The peptide displayed by the B cell is derived from protein molecules that had initially bound to the BCRs at the cell surface. These bound antigens are taken up by endocytosis, fragmented in lysosomes, and bound to MHC class II molecules, as in Figure 17.23b. (A,B: E. LINDHOUT ET AL., IMMUNOL TODAY 18:574, 1997. IMMUNOLOGY TODAY BY ELSEVIER SCI LTD/U. K. REPRODUCED WITH PERMISSION OF ELSEVIER SCI LTD/U. K. IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER. INSET PHOTO; FROM PETER FRIEDL, NATURE REVS. IMMUNOLOGY 5:533, 2005, FIG. 1A, MIDDLE IMAGE; REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LIMITED.)

[Figure 17.26 labels: dendritic cell, TH cell, B cell, whole antigen, BCR, lysosome, antigen fragment, peptide, MHC II, TCR, CD28, B7 protein, CD40, CD40L, cytokine, cytokine receptor.]


those cells capable of responding to that antigen. In addition to triggering cell division, activation of a TH cell causes it to synthesize and secrete cytokines (most notably IL-2). Cytokines produced by activated TH cells act on other cells of the immune system (including B cells and macrophages) and also back on the TH cells that secreted the molecules. The source and function of various cytokines were indicated in Table 17.1. We have seen in this chapter how immune responses are stimulated by ligands that activate receptor signaling pathways. But many of these events are also influenced by inhibitory stimuli, so that the ultimate response by the cell is determined by a balance of positive and negative influences. For example, interaction between CD28 and a B7 protein delivers a positive signal to the T cell leading to its activation. Activation of the T cell leads to the trafficking of a protein called CTLA4 from intracellular membranes to the cell surface. CTLA4 is similar to CD28 in structure and also interacts with B7 proteins of the APC. Unlike the CD28–B7 interaction, however, contact between CTLA4 and B7 leads to inhibition of the T cell’s response, rather than activation. CTLA4 is thought to play a role in maintaining peripheral T cell tolerance. The need for balance between activation and inhibition is most evident in mice that have been genetically engineered to lack the gene encoding CTLA4. These mice die as a result of massive overproliferation of T cells. As noted on page 726, insights into CTLA4 function have recently been put to clinical use.
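The two-signal requirement and the opposing effect of CTLA4 described above can be summarized as simple decision logic. The sketch below is an illustrative simplification under stated assumptions: the function name `th_cell_outcome`, its boolean parameters, and the rule that CTLA4–B7 contact overrides the CD28–B7 signal are hypothetical conveniences, since the text describes the real outcome as a balance of positive and negative influences rather than a strict switch.

```python
def th_cell_outcome(tcr_engaged: bool, cd28_b7: bool, ctla4_b7: bool = False) -> str:
    """Toy decision logic for a helper T cell meeting another cell.

    tcr_engaged: signal 1, the TCR recognizes an MHC class II-peptide complex.
    cd28_b7:     signal 2, CD28 on the T cell contacts a B7 protein on the APC.
    ctla4_b7:    inhibitory contact between CTLA4 and B7 (assumed here to dominate).
    """
    if not tcr_engaged:
        # Without antigen recognition there is no specific interaction at all.
        return "resting"
    if ctla4_b7:
        # Inhibitory signal: the response is suppressed rather than triggered.
        return "inhibited"
    if cd28_b7:
        # Both activation signals received: proliferation and cytokine secretion.
        return "activated"
    # Signal 1 without costimulation, as on a normal body cell lacking B7.
    return "anergized or deleted"


if __name__ == "__main__":
    print(th_cell_outcome(True, True))          # professional APC
    print(th_cell_outcome(True, False))         # normal body cell
    print(th_cell_outcome(True, True, True))    # CTLA4 engaged
    print(th_cell_outcome(False, True))         # no cognate antigen
```

The second case is the one that protects normal tissues: recognition without costimulation silences the TH cell instead of activating it.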

Activation of B Cells by TH Cells

TH cells bind to B cells whose receptors recognize the same antigen. The antigen initially binds to the immunoglobulin (BCR) at the B-cell surface. Antigens taken up by B cells can be soluble proteins from the extracellular medium or proteins bound to the plasma membrane of other cells. In the latter case, the B cell obtains the antigen by spreading itself over the outer surface of the target cell and collecting the BCR-antigen complexes into a central cluster. The bound antigen is then taken into the B cell, where it is processed enzymatically, and its fragments are displayed in combination with MHC class II molecules (Figure 17.26b). The B cell with displayed antigenic peptides recruits a TH cell with the appropriate (cognate) TCR. Recognition of the peptide fragment by the TCR leads to activation of the TH cell, which responds by activating the B cell. Activation of a B cell follows transmission of several signals from the TH cell to the B cell. Some of these signals are transmitted directly from one cell surface to the other through an interaction between complementary proteins, such as CD40 and CD40 ligand (CD40L) (Figure 17.26b). Binding between CD40 and CD40L generates signals that help move the B cell from the G0 resting state back into the cell cycle. Other signals are transmitted by cytokines released by the T cell into the immunologic synapse separating it from the nearby B cell. This process is not unlike the way neurotransmitters act across a neural synapse (page 169). Cytokines released by the T cells into the immunologic synapse include IL-4, IL-5, IL-6, and IL-10. Interleukin 4 is thought to stimulate the B cell to switch from producing the IgM class to producing the IgG or IgE class. Other cytokines induce the proliferation, differentiation, and secretory activities of B cells.

Signal Transduction Pathways in Lymphocyte Activation

We saw in Chapter 15 how hormones, growth factors, and other chemical messengers bind to receptors on the outer surfaces of target cells to initiate a process of signal transduction, which transmits information into a cell’s internal compartments. We saw in that chapter how a large variety of extracellular messenger molecules transmit information along a small number of shared signal transduction pathways. The stimulation of lymphocytes occurs by a similar mechanism and utilizes many of the same components employed by hormones and growth factors that act on other cell types. When a T cell is activated by a dendritic cell, or a B cell is activated by a TH cell, signals are transmitted from plasma membrane to cytoplasm by tyrosine kinases, similar to the signals described in Chapter 15 for insulin and growth factors. Unlike the receptors for insulin and growth factors (page 638), lymphocyte antigen receptors lack an inherent tyrosine kinase activity. Instead, ligand binding to antigen receptors leads to the recruitment of cytoplasmic tyrosine kinase molecules to the inner surface of the plasma membrane. This process may be facilitated by the movement of activated receptors into lipid rafts (page 139). Several different tyrosine kinases have been implicated in signal transduction during lymphocyte activation, including members of the Src and Tec families. Src was the first tyrosine kinase to be identified and is the product of the first cancer-causing oncogene to be discovered (page 696). Activation of these tyrosine kinases leads to a cascade of events and the activation of numerous signal transduction pathways, including:

1. Activation of phospholipase C, which leads to the formation of IP3 and DAG. As discussed on page 630, IP3 causes a marked elevation in the levels of cytosolic Ca²⁺, whereas DAG stimulates protein kinase C activity.
2. Activation of Ras, which leads to the activation of the MAP kinase cascade (page 640).
3. Activation of PI3K, which catalyzes the formation of membrane-bound lipid messengers having diverse functions in cells (page 645).

Transmission of signals along these various pathways, and others, leads to the activation of a number of transcription factors (e.g., NF-κB and NFAT) and the resulting transcription of dozens of genes that are not expressed in the resting T cell or B cell. As noted above, one of the most important responses by an activated lymphocyte is the production and secretion of cytokines. Some of these cytokines may act as part of an autocrine circuit by binding to receptors on the cell that released them. Like other extracellular signals, cytokines bind to receptors on the outer surface of target cells, generating cytoplasmic signals that act on various intracellular targets. Cytokines


utilize a novel signal transduction pathway referred to as the JAK–STAT pathway, which operates without the involvement of second messengers. The “JAK” portion of the name is an acronym for Janus kinases, a family of cytoplasmic tyrosine kinases whose members become activated following the binding of a cytokine to a cell-surface receptor. (Janus is a two-faced Roman god who protected entrances and doorways.) “STAT” is an acronym for “signal transducers and activators of transcription,” a family of transcription factors that become activated when one of their tyrosine residues is phosphorylated by a JAK (see Figure 15.19c). Once phosphorylated, STAT molecules interact to form dimers that translocate from the cytoplasm to the nucleus where they bind to specific DNA sequences, such as an interferon-stimulated response element (ISRE). ISREs are found in the regulatory regions of a dozen or so genes that are activated when a cell is exposed to the cytokine interferon α (IFN-α). As with the hormones and growth factors discussed in Chapter 15, the specific response of a cell depends on the particular cytokine receptor engaged and the particular JAKs and STATs that are present in that cell. For example, it was noted earlier that IL-4 acts to induce Ig class switching in B cells. This response follows the IL-4-induced phosphorylation of the transcription factor STAT6, which is present in the cytoplasm of activated B cells. Resistance to viral infection induced by interferons (page 703) is mediated through the phosphorylation of STAT1. Phosphorylation of other STATs may lead to the progression of the target cell through the cell cycle.
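The dependence of a cell's response on both the cytokine it encounters and the STATs it contains can be sketched as a lookup. This is only an illustration built from the two examples given in the text (IL-4 with STAT6, interferon with STAT1); the table name `STAT_RESPONSES`, the function `jak_stat_response`, and the idea of passing the cell's STATs as a set are hypothetical simplifications, and the specific JAKs involved are left out because the text does not name them.

```python
from typing import Optional, Set, Tuple

# Cytokine -> (STAT that is phosphorylated, resulting response), per the text.
STAT_RESPONSES = {
    "IL-4": ("STAT6", "Ig class switching in B cells"),
    "interferon": ("STAT1", "resistance to viral infection"),
}


def jak_stat_response(cytokine: str, stats_present: Set[str]) -> Optional[Tuple[str, str]]:
    """Return (STAT, response) if the cell can respond to the cytokine, else None.

    A cell responds only when the cytokine is one listed above and the matching
    STAT is present in its cytoplasm to be phosphorylated by a JAK.
    """
    entry = STAT_RESPONSES.get(cytokine)
    if entry is None:
        return None  # cytokine not covered by this toy table
    stat, response = entry
    if stat not in stats_present:
        return None  # required transcription factor absent from this cell
    return stat, response


if __name__ == "__main__":
    activated_b_cell = {"STAT6"}
    print(jak_stat_response("IL-4", activated_b_cell))
    print(jak_stat_response("interferon", activated_b_cell))
```

The same cytokine therefore yields different results, or none, in different cells, which is the point the paragraph makes about receptor, JAK, and STAT repertoires.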

REVIEW
1. Draw the basic structure of an IgG molecule bound to an epitope of an antigen. Label the heavy and light chains; the variable and constant regions of each chain; the regions that would contain hypervariable sequences.
2. Draw the basic arrangement of the genes of the germ line that are involved in encoding the light and heavy chains of an IgG molecule. How does this differ from their arrangement in the genome of an antibody-producing cell? What steps occur to bring about this DNA rearrangement?
3. Give three different mechanisms that contribute to variability in the V regions of antibody chains.
4. Compare and contrast the structure of antigen receptors on B and T cells.
5. Describe the steps in processing a cytosolic antigen in an APC. What is the role of MHC proteins in this process?
6. Compare and contrast the roles of an MHC class I and class II protein molecule. What types of APCs utilize each class of MHC molecule, and what types of cells recognize them?
7. Describe the steps between the stage when a bacterium is ingested by a macrophage and the stage when plasma cells are producing antibodies that bind to the bacterium and neutralize its infectivity.

THE HUMAN PERSPECTIVE


Autoimmune Diseases The immune system requires complex and highly specific interactions between many different types of cells and molecules. Numerous events must take place before a humoral or cell-mediated immune response can be initiated, which makes these processes vulnerable to disruption at various stages by numerous factors. Included among the various types of immune dysfunction are autoimmune diseases, which result when the body mounts an immune response against part of itself. More than 80 different autoimmune diseases have been characterized, affecting approximately 5 percent of the population. Because the specificity of the antigen receptors of both T and B cells is determined by a process of random gene rearrangement, it is inevitable that some members of these cell populations possess receptors directed against the body’s own proteins (self-antigens). Lymphocytes that bind self-antigens with high affinity tend to be removed from the lymphocyte population, making the immune system tolerant toward self. However, some of the self-reactive lymphocytes generated in the thymus and bone marrow escape the body’s negative selection processes, giving them the potential to attack normal body tissues. The presence of B and T lymphocytes capable of reacting against the body’s tissues is readily demonstrated in healthy individuals. For example, when T cells are isolated from the blood and treated in vitro with a normal self-protein, together with the cytokine IL-2, a small number of cells in the population are likely to

proliferate to form a clone of cells that can react to that self-antigen. Similarly, if laboratory animals are injected with a purified self-protein together with an adjuvant, which is a nonspecific substance that enhances the response to the injected antigen, they mount an immune response against the tissues in which that protein is normally found. Under normal circumstances, B and T cells capable of reacting to self-antigens are inhibited by antigen-specific TReg cells (page 710) or by other suppressive mechanisms. When these mechanisms fail, a person may suffer from an autoimmune disease, including those described below.

1. Multiple sclerosis (MS) is a chronic inflammatory disease that typically strikes young adults, causing severe and often progressive neurological damage. MS results from an attack by immune cells and/or antibodies against the myelin sheath that surrounds the axons of nerve cells (page 167). These sheaths form the white matter of the central nervous system. The demyelination of nerves that results from this immunologic attack interferes with the conduction of nerve impulses along axons, leaving the person with diminished eyesight, problems with motor coordination, and disturbances in sensory perception. A disease similar to MS, called experimental allergic encephalomyelitis, can be induced in laboratory animals by injection of myelin basic protein, a major component of the myelin plasma membrane. However, questions have been raised about the validity of this and other mouse models of human autoimmune diseases because agents that have shown efficacy in treating these affected mice often do not translate into effective treatments for humans.

2. Type 1 diabetes (T1D) typically arises in children and results from the autoimmune destruction of the insulin-secreting β cells of the pancreas. Destruction of these cells is mediated by self-reactive T cells, whose proliferation may be stimulated by a viral infection. At present, patients with T1D are administered daily doses of insulin. While the hormone allows them to survive, these individuals may still be subject to degenerative kidney, vascular, and retinal disease. Patients typically possess a significant fraction of their pancreatic β cells (e.g., 10–20 percent) at the time they are diagnosed with the disease. It is hoped that treatments can be developed that will stop further loss of these cells and possibly even increase the number of these cells in their pancreas. As with MS, a well-studied mouse model for T1D, the NOD (non-obese diabetic) mouse, is often used to test drugs in preclinical studies.

3. Graves’ disease and thyroiditis are autoimmune diseases of the thyroid that produce very different symptoms. In Graves’ disease, the target of immune attack is the TSH receptor on the surface of thyroid cells that normally binds the pituitary hormone thyroid-stimulating hormone (TSH). In patients with this disease, autoantibodies bind to the TSH receptor, causing the prolonged stimulation of thyroid cells, leading to hyperthyroidism (i.e., elevated blood levels of thyroid hormone). Thyroiditis (or Hashimoto’s thyroiditis) develops from an immune attack against one or more common proteins of thyroid cells, including thyroglobulin. The resulting destruction of the thyroid gland leads to hypothyroidism (i.e., decreased blood levels of thyroid hormone).

4. Rheumatoid arthritis affects approximately one percent of the population and is characterized by the progressive destruction of the body’s joints due to a cascade of inflammatory responses. In a normal joint, the synovial membrane that lines the synovial cavity is only a single cell layer thick. In persons with rheumatoid arthritis, this membrane becomes inflamed and thickened due to the infiltration of autoreactive immune cells and/or autoantibodies into the joint. Over time, the cartilage is replaced by fibrous tissue, which causes immobilization of the joint.

5. Systemic lupus erythematosus (SLE) gets its name “red wolf” from a reddish rash that develops on the cheeks during the early stages of the disease (Figure 1). Unlike the other autoimmune diseases discussed above, SLE is seldom confined to a particular organ but often attacks tissues throughout the body, including the central nervous system, kidneys, and heart. The serum of patients with SLE contains antibodies directed against a number of components that are found in the nuclei of cells, including small nuclear ribonucleoproteins (snRNPs); proteins of the centromeres of chromosomes; and, most notably, double-stranded DNA. Recent studies suggest that autoimmunity occurs when TLRs that normally recognize microbial DNA and RNA (page 702) mistakenly bind to the body’s own informational macromolecules. The incidence of SLE is particularly high in women of child-bearing age, suggesting a role for female hormones in triggering the disease.

6. Inflammatory bowel diseases (IBDs), such as Crohn’s disease and ulcerative colitis, are characterized by painful inflammation of the intestine. Accumulating evidence suggests that these conditions result from an inappropriate response by the immune system to the normal commensal bacteria that inhabit our digestive system. Genome-wide association studies (page 418) have linked more than 50 genetic loci with susceptibility to these diseases.

Figure 1 A “butterfly” rash that often appears as one of the early symptoms of SLE. (COURTESY OF LUPUS FOUNDATION OF AMERICA, INC.)

Not everyone in the population is equally susceptible to developing one of these autoimmune diseases. Most of these disorders appear much more frequently in certain families than in the general population, indicating a strong genetic component to their development. Though many different genes have been shown to increase susceptibility to autoimmune diseases, those that encode MHC class II polypeptides are most strongly linked. For example, people who inherit certain alleles of the MHC locus are particularly susceptible to developing type 1 diabetes. It is thought that cells bearing MHC molecules encoded by the susceptible alleles can bind particular peptides that stimulate the formation of autoantibodies against insulin-secreting β cells of the pancreas. Other loci that are correlated with increased risk for autoimmune diseases include genes that encode proteins that are involved in T-cell signaling pathways, such as the protein tyrosine phosphatase PTPN22, and genes that encode certain pro-inflammatory cytokines or their receptors. Possession of high-risk alleles may be necessary for an individual to develop certain autoimmune diseases, but it is not the only contributing factor. Studies of identical twins indicate that if one twin develops an autoimmune disease, the likelihood that the other twin will also develop the disease ranges from about 25 to 75 percent, not 100 percent as expected if genetics were the only contributing factor. Identifying the specific environmental or epigenetic factors (page 509) that contribute to the development of these diseases has proven even more elusive than identification of the predisposing genes.

The great progress that has been made over the past two decades in our understanding of the cellular and molecular basis of immunity has led to new treatments for a number of autoimmune diseases that involve manipulation of the body’s immune system. These treatments have been tested in animal models (i.e., animals that can be made to develop diseases similar to those of humans) and their safety and efficacy determined in human clinical trials. Among the approaches taken are:

■ Treatment with immunosuppressive drugs, such as cyclosporin A or CellCept, that block the autoimmune response. Corticosteroids, such as prednisone, are also prescribed for short periods. Because these drugs are nonspecific, they inhibit all types of immune responses and, therefore, render the patient susceptible to dangerous infections.

■ Restoring immunological tolerance to self-antigens so that the body stops producing autoantibodies and autoreactive T cells. Of

all of the approaches discussed in this section, this is the only one that promises the potential for antigen-specific therapy that targets a specific population of autoreactive immune cells; all of the others exert a nonspecific influence on the immune system. One way to induce tolerance to specific antigens is to administer peptides (called APLs) that resemble the peptides that would be generated from the self-antigens responsible for causing the disease. It is hoped that such APLs would bind to TCRs in a suboptimal manner, blocking T-cell activation and reducing the secretion of inflammatory cytokines (e.g., TNF-α and IFN-α). One drug of this type (Copaxone) consists of a mixture of synthetic peptides whose structure resembles that of myelin basic protein. Copaxone remains the only antigen-specific therapy for autoimmune diseases to be approved to date, but a number of similar types of peptide “vaccines” are in clinical trials. While Copaxone may reduce the frequency of relapses in patients with multiple sclerosis, it does not stop disease progression and may elicit severe allergic side effects. Another approach to inducing immunological tolerance to myelin-derived proteins in patients with MS has been to isolate autoreactive T cells from the patient, expand their numbers in culture, render them incapable of replication, and inject them back into the patient with the aim of inducing an immune reaction against the reintroduced cells and other autoreactive cells in the body. Phase III trials of this patient-specific, anti-T cell vaccine (Tovaxin) began in 2011. Yet another way to induce immunological tolerance is to administer a modified version of the protein CTLA4, which binds to the B7 costimulatory protein on the surface of APCs and inhibits these cells’ ability to activate autoreactive T cells. Orencia, which acts in this manner, has been approved for patients with rheumatoid arthritis. Many researchers believe that the best long-term, antigen-specific approach to reestablishing tolerance is to treat a person with his or her own TReg cells. In this approach, the desired TReg cells would be isolated from the patient’s blood, allowed to proliferate extensively in vitro, and then reinjected into the patient in an attempt to suppress the specific self-reactive immune cells that are waging the autoimmune attack. Several clinical trials involving the transfer of TReg cells have begun in patients who are experiencing immunological complications following organ transplantation.

■ Blocking the effect of pro-inflammatory cytokines, which are among the agents that wreak the greatest tissue destruction in many autoimmune diseases. The best studied examples of this approach are antibodies (e.g., Remicade, Cimzia, Simponi, and Humira) and a recombinant fusion protein (Enbrel) that act against the pro-inflammatory cytokine TNF-α. These drugs have been approved for the treatment of rheumatoid arthritis and can have dramatic curative effects in many patients. At the same time, blocking the action of TNF-α can increase the risk of infections (including tuberculosis) and lymphoma. IL-1 is another key pro-inflammatory cytokine, and a number of IL-1 inhibitors are presently in the drug pipeline. Kineret, a recombinant protein that blocks the activity of IL-1 by binding to the IL-1 receptor, has been in use for over a decade for the treatment of rheumatoid arthritis. Similarly, a human monoclonal antibody (Stelara) directed against IL-12 and IL-23 shows efficacy against Crohn’s disease and psoriasis.

■ Treatment with cytokines. At present, the most widely prescribed treatment for multiple sclerosis is one of several approved IFN-β cytokines (Avonex, Rebif, Extavia, and Betaseron), which reduce disease progression by an average of 35 percent. IFN-β has many activities, and the precise mechanism of action is debated.


■ Treatment with agents that destroy B cells or block their activation. Although autoimmune diseases are generally thought to result from the dysfunction of T cells, particularly helper T cells, there is considerable evidence that self-reactive B cells have several roles to play. In addition to producing antibodies, B cells can serve as APCs for antigen-specific T cells and they secrete a variety of cytokines. The importance of B cells in autoimmune disorders has been confirmed in clinical studies with two monoclonal antibodies (Rituxan and Arzerra) that bind to and destroy B cells. Rituxan had been used as a safe and successful treatment for lymphoma (page 688) before it was tested in patients with rheumatoid arthritis and multiple sclerosis, where it has proved surprisingly effective. It is remarkable that even though these antibody-based drugs virtually deplete the blood of circulating B cells for a period of several months, they do not limit a patient’s ability to mount immune responses against infectious agents. In 2011 an antibody (Benlysta) that indirectly inhibits B cell production was approved for the treatment of SLE, the first new drug to be approved for treatment of the disease in 50 years. Benlysta acts by targeting BLyS, a protein that stimulates B cell proliferation and differentiation, and provides benefit to a fraction of patients with SLE.

■ Disrupting the movement of self-reactive immune cells to areas of inflammation. The first effective treatment based on this concept to be approved by the FDA was a monoclonal antibody (Tysabri) that is directed against the integrin subunit α4 present on the surface of activated T cells. The drug is aimed at preventing these T cells from crossing the blood-brain barrier (page 262) and attacking the myelin sheaths of the central nervous system of MS patients. There is always a concern with any type of immune therapy that the treatment will interfere with the body’s ability to fight infection. This concern was realized when Tysabri was temporarily removed from the market in 2005 after several patients developed a serious viral brain infection (called PML). At the present time, Tysabri is widely prescribed to treat MS, although the small risk of contracting PML remains a concern. In 2010 the first MS drug that could be taken orally, called Gilenya, was approved by the FDA based on its ability to reduce relapses and delay the progression of the disease. Like Tysabri, Gilenya blocks the migration of T cells but does so by a very different mechanism. Gilenya is a small synthetic compound that, once phosphorylated by the body, binds to sphingosine 1-phosphate receptors on the surface of T cells, causing the receptors to be internalized by endocytosis. Cells lacking the surface receptors cannot respond to their normal ligand, which would promote their migration out of the thymus and lymph nodes and into the bloodstream, from where they could reach the central nervous system. Several other orally administered MS drugs are currently in late-stage clinical trials.

■ Transplantation of hematopoietic stem cells (page 20) from either the patient (an autologous transplant) or a closely matched donor (an allogeneic transplant). Because this procedure has the potential to cause life-threatening complications, it represents a treatment for only those patients with severe and debilitating autoimmune disease. However, unlike the drugs described above, transplant recipients begin the rest of their lives with a dramatically altered immune system and the possibility of being completely cured of their disease. It is estimated that about one-third of transplant recipients experience dramatic long-term benefits from the procedure while another third obtain no apparent benefit at all. The reasons for this marked discrepancy in response remain unclear.


EXPERIMENTAL PATHWAYS: The Role of the Major Histocompatibility Complex in Antigen Presentation

Table 1 Cytotoxic Activity of Spleen Cells from Various Strains of Mice Injected 7 Days Previously with LCM Virus for Monolayers of LCM-Infected or Normal C3H (H-2k) Mouse L Cells

                                        % 51Cr release
Exp   Mouse strain       H-2 type   Infected       Normal
1     CBA/H              k          65.1 ± 3.3     17.2 ± 0.7
1     Balb/C             d          17.9 ± 0.9     17.2 ± 0.6
1     C57Bl              b          22.7 ± 1.4     19.8 ± 0.9
1     CBA/H × C57Bl      k/b        56.1 ± 0.5     16.7 ± 0.3
1     C57Bl × Balb/C     b/d        24.8 ± 2.4     19.8 ± 0.9
1     nu/+ or +/+        -          42.8 ± 2.0     21.9 ± 0.7
1     nu/nu              -          23.3 ± 0.6     20.0 ± 1.4
2     CBA/H              k          85.5 ± 3.1     20.9 ± 1.2
2     AKR                k          71.2 ± 1.6     18.6 ± 1.2
2     DBA/2              d          24.5 ± 1.2     21.7 ± 1.7
3     CBA/H              k          77.9 ± 2.7     25.7 ± 1.3
3     C3H/HeJ            k          77.8 ± 0.8     24.5 ± 1.5

Reprinted with permission from R. M. Zinkernagel and P. C. Doherty; Nature 248:701, 1974; copyright 1974, Nature by Nature Publishing Group. Reproduced with permission of Nature Publishing Group in the format reuse in a book/ textbook via Copyright Clearance Center.
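One way to read Table 1 is to subtract the 51Cr release obtained with normal L cells from the release obtained with infected L cells for each donor strain. The short Python sketch below does this for the strains whose H-2 type is listed in the table. The subtraction, the function names, and the grouping by whether the donor carries a k allele are illustrative assumptions and not the analysis used by Zinkernagel and Doherty; only the mean values from the table are used, and the standard errors are ignored.

```python
# Mean % 51Cr release from Table 1 (standard errors omitted).
# Each row: (donor strain, H-2 type, release from infected L cells,
#            release from normal L cells). The L cells are all H-2k.
TABLE_1 = [
    ("CBA/H", "k", 65.1, 17.2),
    ("Balb/C", "d", 17.9, 17.2),
    ("C57Bl", "b", 22.7, 19.8),
    ("CBA/H x C57Bl", "k/b", 56.1, 16.7),
    ("C57Bl x Balb/C", "b/d", 24.8, 19.8),
    ("CBA/H", "k", 85.5, 20.9),
    ("AKR", "k", 71.2, 18.6),
    ("DBA/2", "d", 24.5, 21.7),
    ("CBA/H", "k", 77.9, 25.7),
    ("C3H/HeJ", "k", 77.8, 24.5),
]


def net_release(infected, normal):
    """Infected minus normal release (an assumed, simple background correction)."""
    return round(infected - normal, 1)


def carries_k(h2_type):
    """True if the donor has at least one H-2k allele, e.g. 'k' or 'k/b'."""
    return "k" in h2_type.split("/")


def summarize(rows):
    """Return (strain, H-2 type, carries k?, net release) for every row."""
    return [(s, h2, carries_k(h2), net_release(inf, norm))
            for s, h2, inf, norm in rows]


if __name__ == "__main__":
    for strain, h2, has_k, net in summarize(TABLE_1):
        print(f"{strain:16s} H-2 {h2:4s} carries k: {has_k!s:5s} net release: {net:5.1f}")
```

With these numbers, every donor that carries an H-2k allele gives a net release well above that of any donor lacking it, which is the pattern described in the text as MHC restriction.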

17.4 Selected Topics on the Cellular and Molecular Basis of Immunity

In 1973, Hugh McDevitt and his colleagues at the Scripps Foundation in La Jolla, California, and at Stanford University, demonstrated that the susceptibility of mice to a particular pathogen depends on the allele present at one of the MHC loci.1 They found that lymphocytic choriomeningitis virus (LCMV) causes a lethal brain infection in mice that are either homozygous or heterozygous for the H-2q allele but does not cause infections in mice that are homozygous for the H-2k allele at this locus. These findings led Rolf Zinkernagel and Peter Doherty of the Australian National University to examine the role of cytotoxic T lymphocytes (CTLs) in the development of this disease. Zinkernagel and Doherty planned experiments to correlate the level of CTL activity with the severity of the disease in mice having different MHC genotypes (or haplotypes, as they are called). Cytotoxic T lymphocyte activity was monitored using the following experimental protocol. Monolayers of fibroblasts (L cells) from one mouse were grown in culture and subsequently infected with the LCM virus. The infected fibroblasts were then overlaid by a preparation of spleen cells from a mouse that had been infected with the LCM virus seven days earlier. Waiting for seven days gives the animal’s immune system time to generate CTLs against virus-infected cells. The CTLs become concentrated in the infected animal’s spleen. To monitor the effectiveness of the attack by the CTLs on the cultured L cells, the L cells were first labeled with a radioisotope of chromium (51Cr). 51Cr is used as a marker for cell viability: as long as a cell remains alive, the radioisotope remains inside the cell. If a cell should be lysed during the experiment by a CTL, the 51Cr is released into the medium.

Zinkernagel and Doherty found that the level of CTL activity against the cultured fibroblasts, as measured by the release of 51Cr, depended on the relative genotypes of the fibroblasts and spleen cells (Table 1).2 All of the fibroblasts used to obtain the data shown in Table 1 were taken from an inbred strain of mice that were homozygous for the allele H-2k at the H-2 locus. When spleen cells were prepared from mice having an H-2k allele (e.g., CBA/H, AKR, and C3H/HeJ strains of mice), the L cells were effectively lysed. However, spleen cells taken from mice bearing H-2b or H-2d alleles at this locus were unable to lyse the infected fibroblasts. (The 51Cr released is approximately the same when noninfected fibroblasts are used in the assay, as shown in the right column of Table 1.)

It was essential to show that the results were not peculiar to mice bearing H-2k alleles. To make this determination, Zinkernagel and Doherty tested LCMV-activated spleen cells from H-2b mice against various types of infected cells. Once again, the CTLs would only lyse infected cells having the same H-2 genotype, in this case H-2b. These studies provided the first evidence that the MHC molecules on the surface of an infected cell restrict its interactions with T cells. T-cell function is said to be MHC restricted.

These and other experiments during the 1970s raised questions about the role of MHC proteins in immune cell function. Meanwhile, another line of investigation was focusing on the mechanism by which T cells were stimulated by particular antigens. Studies had indicated that T cells respond to antigen that is bound to the surface of other cells. It was presumed that the antigen being displayed had simply bound to the surface of the antigen-presenting cell (APC) from the extracellular medium. During the mid-1970s and early 1980s, studies by Alan Rosenthal at NIH, Emil Unanue at Harvard University, and others demonstrated that antigen had to be internalized by the APC and subjected to some type of processing before it could stimulate T-cell proliferation. Most of these studies were carried out in cell culture utilizing T cells activated by macrophages that had been previously exposed to bacteria, viruses, or other foreign material.

One of the ways to distinguish between antigen that is simply bound to the surface of an APC and antigen that has been processed by metabolic activities is to compare events that can occur at low temperatures (e.g., 4°C), where metabolic processes are blocked, with those that occur at normal body temperatures. In one of these earlier experiments, macrophages were incubated with antigen for one hour at either 4°C or 37°C and then tested for their ability to stimulate T cells prepared from lymph nodes.3 At lower antigen concentrations, macrophages were nearly 10 times more effective at stimulating T cells at 37°C than at 4°C, suggesting that antigen processing requires metabolic activities. Treatment of cells with sodium azide, a cytochrome oxidase inhibitor, also inhibited the appearance of antigen on the surface of T cells, indicating that antigen presentation requires metabolic energy.4

Subsequent experiments by Kirk Ziegler and Emil Unanue provided evidence that processing occurred as extracellular antigens were taken into the macrophage by endocytosis and delivered to the cell’s lysosomal compartment.5 One approach to determining whether lysosomes are involved in a particular process is to treat the cells with substances, such as ammonium chloride or chloroquine, that disrupt lysosomal enzyme activity. Both of these agents raise the pH of the lysosomal compartment, which inactivates the acid hydrolases (page 304). Table 2 shows the effects of these treatments on processing and presentation of antigen derived from the bacterium Listeria monocytogenes. It can be seen from the table that neither of these substances affected the uptake (endocytosis) of antigen, but both substances markedly inhibited the processing of the antigen and its ability to stimulate binding of T cells to the macrophage. These data were among the first to suggest that fragmentation of extracellular antigens by lysosomal proteases may be an essential step in preparation of extracellular antigens prior to presentation.


Table 2 Inhibition of Antigen Presentation with NH4Cl and Chloroquine

                                                                  10 mM NH4Cl                     0.1 mM Chloroquine
Assay                                               Control (%)   Observed (%)   Inhibition (%)   Observed (%)   Inhibition (%)
Antigen uptake                                      15 ± 1        13 ± 2         13               15 ± 2         0
Antigen ingestion                                   66 ± 2        63 ± 2         5                67 ± 6         −2
Antigen catabolism                                  29 ± 4        13 ± 3         55               14 ± 6         52
T cell-macrophage binding, before antigen handling  70 ± 7        26 ± 8         63               30 ± 8         57
T cell-macrophage binding, after antigen handling   84 ± 8        70 ± 11        17               60 ± 10        24

Chapter 17 The Immune Response

Source: H. K. Ziegler and E. R. Unanue, Proc. Nat’l. Acad. Sci. U.S.A. 79:176, 1982.
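The inhibition columns of Table 2 appear to follow the usual definition of percent inhibition, (control − observed) / control × 100. The Python sketch below recomputes the values from the rounded means given in the table. The formula is an assumption about how the published figures were derived, and the function and variable names are purely illustrative.

```python
def percent_inhibition(control, observed):
    """Percent inhibition relative to the untreated control, rounded to a whole number."""
    return round(100 * (control - observed) / control)


# Rounded means from Table 2: (control, 10 mM NH4Cl observed, 0.1 mM chloroquine observed)
TABLE_2_MEANS = {
    "Antigen uptake": (15, 13, 15),
    "Antigen ingestion": (66, 63, 67),
    "Antigen catabolism": (29, 13, 14),
    "Binding before antigen handling": (70, 26, 30),
    "Binding after antigen handling": (84, 70, 60),
}

if __name__ == "__main__":
    for assay, (control, nh4cl, chloroquine) in TABLE_2_MEANS.items():
        print(assay,
              "| NH4Cl:", percent_inhibition(control, nh4cl),
              "| chloroquine:", percent_inhibition(control, chloroquine))
```

Under this assumption the recomputed values match nine of the ten entries in the table. For chloroquine after antigen handling the calculation gives 29 where the table lists 24; the difference may reflect rounding of the means or a different basis for the published figure.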

Other studies continued to implicate MHC molecules in the interaction between APCs and T cells. In one series of experiments, Ziegler and Unanue treated macrophages with antibodies directed against MHC proteins encoded by the H-2 locus. They found that these antibodies had no effect on the uptake or catabolism of antigen,6 but markedly inhibited the macrophages from interacting with T cells.7 Inhibition of binding of T cells to macrophages was obtained even when macrophages were exposed to the antibodies before addition of antigen. Evidence from these and many other studies indicated that the interaction between a T cell and a macrophage depended on the recognition of two components on the surface of the antigen-presenting cell: the antigen fragment being displayed and an MHC molecule. But there was no clear-cut picture as to how the antigen fragment and MHC molecule were related. Two models of antigen recognition were considered likely possibilities. According to one model, T cells possess two distinct receptors, one for the antigen and another for the MHC protein. According to the other model, a single T-cell receptor recognizes both the MHC protein and the antigen peptide on the APC surface simultaneously. The balance of opinion began to shift in favor of the one-receptor model as evidence began to point to a physical association between MHC proteins and displayed antigens. In one study, for example, it was shown that antigen that had been processed by T cells could be isolated as a complex with MHC proteins.8 In this experiment, cultured T cells taken from H-2k mice were incubated with radioactively labeled antigen for 40 minutes. After the incubation period, processed antigen was prepared from the cells and passed through a column containing beads coated with antibodies directed against MHC proteins. When the beads were coated with antibodies against H-2k protein, an MHC molecule present in the T cells, large amounts of radioactive antigen adhered to the beads, indicating the association of the processed antigen with the MHC protein. If the beads were instead coated with antibodies against H-2b protein, an MHC protein that was not present in the T cells, relatively little radioactive antigen remained in the column.

Following these early experiments, investigators turned their attention to the atomic structure of the molecules involved in T-cell interactions, which lies at the heart of the immune response. Rather than utilizing MHC class II molecules on the surfaces of macrophages, structural studies have examined MHC class I molecules of the type found on the surfaces of virally infected cells. The first three-dimensional portrait of an MHC molecule was published in 1987 and was based on X-ray crystallographic studies by Don Wiley and colleagues at Harvard University.9 The events leading up to this discovery are discussed in Reference 10. MHC class I molecules consist of (1) a heavy chain containing three extracellular domains (α1, α2, and α3) and a single membrane-spanning segment and (2) an invariant β2m polypeptide (see Figure 17.23). Wiley and colleagues examined the structure of the extracellular (soluble) portion of the MHC molecule (α1, α2, α3, and β2m) after removing the transmembrane anchor. A ribbon model of the observed structure is shown in Figure 1a, with the outer (antigen-bearing) portion of the protein constructed from the α1 and α2 domains. It can be seen from the top portion of this ribbon model that the inner surfaces of these domains form the walls of a deep groove approximately 25 Å long and 10 Å wide. It is this groove that acts as the binding site for peptides produced by antigen processing in the cytoplasm. As shown in Figure 1b, the sides of the antigen-binding pocket are lined by α helices from the α1 and α2 domains, and the bottom of the pocket is lined by β sheet that extends from these same domains across the midline. The helices are thought to form relatively flexible side walls enabling peptides of different sequence to bind within the groove.

Subsequent X-ray crystallographic studies described the manner in which peptides are positioned within the MHC antigen-binding pocket. In one of these studies, the spatial arrangement of several naturally processed peptides situated within the antigen-binding pocket of a single MHC class I molecule (HLA-B27) was determined.11 The backbones of all the peptides bound to HLA-B27 share a single, extended conformation running the length of the binding cleft. The N- and C-termini of the peptides are precisely positioned by numerous hydrogen bonds at both ends of the cleft. The hydrogen bonds link the peptide to a number of conserved residues in the MHC molecule that are part of both the sides and bottom of the binding groove.
In another key study, Ian Wilson and colleagues at the Scripps Research Institute in La Jolla, California, reported on the X-ray crystallographic structure of a mouse MHC class I protein complexed with two peptides of different length.12,13 The overall structure of the mouse MHC protein is similar to that of the human MHC protein shown in Figure 1a. In both cases, the peptides are bound in an extended conformation deep within the binding groove of the MHC molecule (Figure 2). This extended conformation allows for numerous interactions between the side chains of the MHC molecule and the backbone of the bound peptide. Because the MHC interacts primarily with the backbone of the peptide rather than its side chains, there are very few restrictions on the particular amino acid residues that can be present at various sites of the binding pocket. As a result, each MHC molecule can bind a diverse array of antigenic peptides. An MHC–peptide complex projecting from the surface of an infected cell is only half the story of immunologic recognition; the other half is represented by the T-cell receptor (TCR) projecting from the surface of the cytotoxic T cell. It has been evident for more than a decade that a TCR can somehow recognize both an MHC and its contained peptide, but the manner by which this occurs eluded researchers because of difficulties in preparing protein crystals of the TCR that were suitable for X-ray crystallography. These


(b) Schematic representation of the peptide-binding pocket of the MHC protein as viewed from the top of the molecule. The bottom of the pocket is lined by β sheet (orange-purple arrows) and the walls by α helices (green). The α1 domain is shown in orange and light green; the α2 domain is shown in purple and dark green. (FROM PAMELA J. BJORKMAN ET AL., NATURE 329:508, 509, 1987, FIG. 2A, 2B; REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

difficulties were eventually overcome, and in 1996, reports were published by both the Wiley and Wilson laboratories that provided a three-dimensional portrait of the interaction between an MHC–peptide complex and a TCR.14,15 The overall structure of the complex formed between the two proteins is shown in Figure 3, where the backbones of the polypeptides are portrayed as tubes. The structure shown in Figure 3 depicts the portions of the proteins that project between a CTL and virally infected host cell. The lower half of the image shows the structure and orientation of the MHC class I molecule with the extended peptide antigen (yellow-green) embedded within the protein’s binding pocket. The upper half of the image shows the structure and orientation of the TCR. As indicated in Figure 17.20b, a TCR consists of α and β polypeptide chains, each chain composed of a variable (V) and constant (C) portion. Like immunoglobulins (Figure 17.15), the variable portion of each TCR subunit contains regions that are exceptionally variable (i.e., hypervariable). The hypervariable regions form protruding loops (shown as the colored segments of the two TCR polypeptides in Figure 3) that fit snugly over the outer end of the MHC–peptide

Figure 2 Three-dimensional models of a peptide (shown in ball-andstick structure) bound within the antigen-binding pocket of a mouse MHC class I protein (H-2Kb). The peptide in a consists of eight amino acid residues; the peptide in b consists of nine residues. The peptides are seen to be buried deep within the MHC binding groove. (FROM MASAZUMI MATSUMURA, DAVED H. FREMONT, PER A. PETERSON, AND IAN A. WILSON, SCIENCE 257:932, 1992. © 1992. REPRINTED WITH PERMISSION FROM AAAS.)


Figure 1 (a) Schematic representation of an MHC class I molecule, in this case the human protein HLA-A2. The molecule consists of two subunits: a heavy chain made up of three domains (α1, α2, and α3) and a β2m chain. The membrane-spanning portion of the heavy chain would connect with the polypeptide at the site marked C (for C-terminus). Disulfide bonds are indicated as two connected spheres. The peptide-binding groove is shown at the top of the drawing, between the α helical segments of the α1 and α2 domains of the heavy chain.

complex. The hypervariable regions are referred to as complementarity-determining regions, or CDRs, because they determine the binding properties of the TCR. The CDRs of the TCR interact with the α helices of the α1 and α2 domains of the MHC, as well as the exposed residues of the bound peptide. The central CDRs of the TCR, which exhibit the greatest sequence variability, interact primarily with the centrally situated bound peptide, whereas the outer CDRs, which have a less variable sequence, interact most closely with the α helices of the MHC.16 Because of these interactions, the TCR meets both of its recognition “responsibilities”: it recognizes the bound peptide as a foreign antigen and the MHC as a self-protein. (Information from recent studies of additional TCR–peptide–MHC structures can be found in References 17–19.)

Figure 3 Representation of the interaction between an MHC–peptide complex (at the bottom) and a TCR (at the top). The hypervariable regions (CDRs) of the TCR are shown as colored loops that form the interface between the two proteins. The bound peptide (P1-P8) is shown in yellow-green as it is situated within the binding groove of the MHC class I molecule. The peptide backbones are represented as tubes. (FROM K. CHRISTOPHER GARCIA ET AL., COURTESY OF IAN WILSON, SCIENCE 274:217, 1996, FIG. 9D, © 1996. REPRINTED WITH PERMISSION FROM AAAS.)

References

1. OLDSTONE, M. B. A., ET AL. 1973. Histocompatibility-linked genetic control of disease susceptibility. J. Exp. Med. 137:1201–1212.
2. ZINKERNAGEL, R. M. & DOHERTY, P. C. 1974. Restriction of in vitro T cell-mediated cytotoxicity in lymphocytic choriomeningitis within a syngeneic or semiallogeneic system. Nature 248:701–702.
3. WALDRON, J. A., ET AL. 1974. Antigen-induced proliferation of guinea pig lymphocytes in vitro: Functional aspects of antigen handling by macrophages. J. Immunol. 112:746–755.
4. WEKERLE, H., ET AL. 1972. Fractionation of antigen reactive cells on a cellular immunoadsorbant. Proc. Nat’l. Acad. Sci. U.S.A. 69:1620–1624.
5. ZIEGLER, K. & UNANUE, E. R. 1982. Decrease in macrophage antigen catabolism caused by ammonia and chloroquine is associated with inhibition of antigen presentation to T cells. Proc. Nat’l. Acad. Sci. U.S.A. 79:175–178.
6. ZIEGLER, K. & UNANUE, E. R. 1981. Identification of a macrophage antigen-processing event required for I-region-restricted antigen presentation to T lymphocytes. J. Immunol. 127:1869–1875.
7. ZIEGLER, K. & UNANUE, E. R. 1979. The specific binding of Listeria monocytogenes-immune T lymphocytes to macrophages. I. Quantitation and role of H-2 gene products. J. Exp. Med. 150:1142–1160.
8. PURI, J. & LONAI, P. 1980. Mechanism of antigen binding by T cells. H-2 (I-A)-restricted binding of antigen plus Ia by helper cells. Eur. J. Immunol. 10:273–281.
9. BJORKMAN, P. J., ET AL. 1987. Structure of the human class I histocompatibility antigen, HLA-A2. Nature 329:506–512.
10. BJORKMAN, P. J. 2006. Finding the groove. Nature Immunol. 7:787–789.
11. MADDEN, D. R., ET AL. 1992. The three-dimensional structure of HLA-B27 at 2.1 Å resolution suggests a general mechanism for tight peptide binding to MHC. Cell 70:1035–1048.
12. FREMONT, D. H., ET AL. 1992. Crystal structures of two viral peptides in complex with murine MHC class I H-2Kb. Science 257:919–927.
13. MATSUMURA, M., ET AL. 1992. Emerging principles for the recognition of peptide antigens by MHC class I molecules. Science 257:927–934.
14. GARCIA, K. C., ET AL. 1996. An αβ T cell receptor structure at 2.5 Å and its orientation in the TCR–MHC complex. Science 274:209–219.
15. GARBOCZI, D. N., ET AL. 1996. Structure of the complex between human T-cell receptor, viral peptide and HLA-A2. Nature 384:134–140.
16. WILSON, I. A. 1999. Class-conscious TCR? Science 286:1867–1868.
17. MARRACK, P., ET AL. 2008. Evolutionarily conserved amino acids that control TCR–MHC interaction. Annu. Rev. Immunol. 26:171–203.
18. ARCHBOLD, J. K., ET AL. 2008. T-cell allorecognition. Trends Immunol. 29:220–226.
19. GARCIA, K. C., ET AL. 2009. The molecular basis of TCR germline bias for MHC is surprisingly simple. Nature Immunol. 10:143–147.


Synopsis

Vertebrates are protected from infection by immune responses carried out by cells that can distinguish between materials that are “supposed” to be there (i.e., “self”) and those that are not (i.e., foreign, or “nonself”). Innate responses occur rapidly but lack a high level of specificity, whereas adaptive immune responses are highly specific but require a lag period of several days. Innate responses are carried out by patrolling phagocytic cells bearing Toll-like receptors, blood-borne molecules (complement) that can lyse bacterial cells, antiviral proteins (interferons), and natural killer cells that can cause infected cells to undergo apoptosis. Two broad categories of adaptive immunity can be distinguished: (1) humoral (blood-borne) immunity mediated by antibodies that are produced by cells derived from B lymphocytes (B cells) and (2) cell-mediated immunity carried out by T lymphocytes (T cells). B and T cells both arise from the same stem cell, which also gives rise to other types of blood cells. (p. 700)

The cells of the immune system develop by clonal selection. During its development, each B cell becomes committed to producing only one species of antibody molecule, which is initially displayed as an antigen receptor in the cell’s plasma membrane. B-cell commitment occurs in the absence of antigen, so that the entire repertoire of antibody-producing cells is already present prior to stimulation by antigen. When a foreign substance appears in the body, it functions as an antigen by interacting with B cells that contain membrane-bound antibodies capable of binding that substance. Antigen binding activates the B cell, causing it to proliferate and form a clone of cells that differentiate into antibody-producing plasma cells. In this way, antigen selects those B cells that produce antibodies capable of interacting with it. Some of the antigen-selected B cells remain as memory cells that can respond rapidly if the antigen is reintroduced. B cells whose antigen receptors are capable of reacting to the body’s own tissues are either inactivated or eliminated, causing the body to develop an immunologic tolerance toward itself. (p. 704)

T lymphocytes recognize antigens by their T-cell receptors (TCRs). T cells can be divided into three distinct subclasses: helper T cells (TH cells) that activate B cells, cytotoxic T lymphocytes (CTLs) that kill foreign or infected cells, and regulatory T cells (TReg cells) that generally suppress other T cells. Unlike B cells, which are activated by soluble, intact antigens, T cells are activated by fragments of antigens that are displayed on the surfaces of other cells, called antigen-presenting cells (APCs). Whereas any infected cell can activate a CTL, only professional APCs, such as macrophages and dendritic cells, that function in antigen ingestion and processing can activate TH cells. Cytotoxic T cells kill target cells by secreting proteins that permeabilize the membrane of the target cell and induce it to undergo apoptosis. (p. 707)

Antibodies are globular proteins called immunoglobulins (Igs) that are built of two types of polypeptide chains: heavy and light chains. Antibodies are divided into several classes (IgA, IgG, IgD, IgE, and IgM) depending on their heavy chain. Heavy and light chains both consist of (1) constant (C) regions, that is, regions whose amino acid sequence is virtually identical among all heavy or light chains of a given class, and (2) variable (V) regions, that is, regions whose sequence varies from one antibody species to another. Different classes of antibody appear at different times after exposure to an antigen and have different biological functions. IgG is the predominant form of blood-borne antibody. Each Y-shaped IgG molecule contains two identical heavy chains and two identical light chains. The antigen-combining sites are formed by association of the V region of a light chain with the V region of a heavy chain. Included within the V regions are hypervariable regions that form the walls of the antigen-combining site. (p. 710)

Antigen receptors on both B and T cells are encoded by genes produced by DNA rearrangement. Each variable region of a light Ig chain is composed of two DNA segments (a V and J segment) that are joined together, whereas each variable region of a heavy Ig chain is composed of three joined segments (a V, J, and D segment). Variability arises as the result of the joining of different V genes with different J genes in different antibody-producing cells. Additional variability is introduced by the enzymatic insertion of nucleotides and variation in the V–J joining site. It is estimated that an individual can synthesize more than 2000 different species of kappa light chains and 100,000 species of heavy chains, which can combine randomly to form more than 200 million antibodies. Additional diversity arises in antibody-producing cells by somatic hypermutation in which the rearranged V genes experience a mutation rate much greater than the remainder of the genome. (p. 713)

Antigens are taken up by APCs, fragmented into short peptides, and combined with proteins encoded by the major histocompatibility complex (MHC) for presentation to T cells. MHC proteins can be subdivided into two classes: class I molecules and class II molecules. MHC class I molecules primarily display peptides derived from endogenous cytosolic proteins, including those derived from an infecting virus. Nearly any cell of the body can display peptides through MHC class I molecules to CTLs. If the TCR of the CTL recognizes a foreign peptide displayed by a cell, the CTL becomes activated to kill the target cell. Professional APCs, including macrophages and dendritic cells, display peptides in combination with MHC class II molecules. The MHC–peptide complexes on the APC are recognized by TCRs on the surface of TH cells. Helper T cells that become activated by antigens displayed on an APC subsequently associate with and activate B cells whose antigen receptors (BCRs consisting of Igs) recognize the same antigen. Helper T cells stimulate B cells by direct interaction and by secretion of cytokines. Activation of a T cell by a TCR leads to the stimulation of protein tyrosine kinases within the T cell, which in turn leads to the activation of a number of signaling pathways, including the activation of Ras and the MAP kinase cascade and the activation of phospholipase C. (p. 716)
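The antibody-diversity estimate quoted in the synopsis is a simple product: if any light chain can pair with any heavy chain, the number of possible antibodies is the number of light-chain species multiplied by the number of heavy-chain species. The Python sketch below reproduces that arithmetic. The function name is illustrative, and the calculation assumes fully random pairing, as the synopsis states; it does not account for the additional diversity contributed by somatic hypermutation.

```python
def combined_antibody_species(light_chain_species, heavy_chain_species):
    """Number of light/heavy chain pairings, assuming every pairing is possible."""
    return light_chain_species * heavy_chain_species


if __name__ == "__main__":
    kappa_light_chains = 2_000   # "more than 2000" species of kappa light chains
    heavy_chains = 100_000       # "100,000 species of heavy chains"
    total = combined_antibody_species(kappa_light_chains, heavy_chains)
    print(f"{total:,} possible antibodies")
```

Because both inputs are lower-bound estimates, the product of 200 million is itself a lower bound, consistent with the wording "more than 200 million antibodies."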


18 Techniques in Cell and Molecular Biology

18.1 The Light Microscope
18.2 Transmission Electron Microscopy
18.3 Scanning Electron and Atomic Force Microscopy
18.4 The Use of Radioisotopes
18.5 Cell Culture
18.6 The Fractionation of a Cell’s Contents by Differential Centrifugation
18.7 Isolation, Purification, and Fractionation of Proteins
18.8 Determining the Structure of Proteins and Multisubunit Complexes
18.9 Fractionation of Nucleic Acids
18.10 Nucleic Acid Hybridization
18.11 Chemical Synthesis of DNA
18.12 Recombinant DNA Technology
18.13 Enzymatic Amplification of DNA by PCR
18.14 DNA Sequencing
18.15 DNA Libraries
18.16 DNA Transfer into Eukaryotic Cells and Mammalian Embryos
18.17 Determining Eukaryotic Gene Function by Gene Elimination or Silencing
18.18 The Use of Antibodies

Chapter-opening micrographs: The use of dual-label fluorescence to follow dynamic events within cellular organelles of living cells. The Golgi cisternae of a budding yeast cell are not organized into distinct stacks as in most eukaryotic cells but are dispersed within the cytoplasm. Each of the brightly colored oval structures is an individual cisterna whose color is due to the localization of fluorescently labeled protein molecules. A cisterna that appears green contains a GFP-labeled protein (Vrg4) that is involved in early Golgi activities, whereas a cisterna that appears red contains a DsRed-labeled protein (Sec7) that is involved in late Golgi activities. This series of micrographs reveals the protein composition of individual cisternae over a period of about 13 minutes (elapsed time is indicated in the lower left corner). The white arrowhead and arrow point out two of these cisternae over time. These two cisternae are shown imaged by themselves in the lower rows of micrographs. It can be seen from this series that the protein composition of an individual cisterna changes over time from one containing “early” Golgi proteins (green) to one containing “late” Golgi proteins (red). These findings provide direct visual support for the cisternal maturation model discussed on page 293. (FROM EUGENE LOSEV ET AL., COURTESY OF BENJAMIN S. GLICK, NATURE 441:1004, 2006, FIG. 3A. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

Because of the very small size of the subject matter, cell and molecular biology is more dependent on the development of new instruments and technologies than any other branch of biology. Consequently, it is difficult to learn about cell and molecular biology without also learning about the technology that is required to collect data. In this chapter, we will survey the methods used most commonly in the field without becoming immersed in the details or the many variations that are employed. These are the goals of the present chapter: to describe the ways that selected techniques are used and to provide examples of the types of information that can be learned by using these techniques. We will begin with the instrument that enabled biologists to discover the very existence of cells, providing the starting point for all of the information that has been presented in this text.

18.1 | The Light Microscope

Microscopes are instruments that produce an enlarged image of an object. Figure 18.1 shows the most important components of a compound light microscope. A light source, which may be external to the microscope or built into its base, illuminates the specimen. The substage condenser lens gathers the diffuse rays from the light source and illuminates the specimen with a small cone of bright light that allows very small parts of the specimen to be seen after magnification. The light rays focused on the specimen by the condenser lens are then collected by the microscope’s objective lens. From this point, we need to consider two sets of light rays that enter the objective lens: those that the specimen has altered and those that it hasn’t (Figure 18.2). The latter group consists of light from the condenser that passes directly into the objective lens, forming the background light of the visual field. The former group of light rays emanates from the many parts of the specimen and forms the image of the specimen. These light rays are brought to focus by the objective lens to form a real, enlarged image of the object within the column of the microscope (Figure 18.1). The image formed by the objective lens is used as an object by a second lens system, the ocular lens, to form an enlarged and virtual image. A third lens system located in the front part of the eye uses the virtual image produced by the ocular lens as an object to produce a real image on the retina. When the focusing knob of the light microscope is turned, the relative distance between the specimen and the objective lens changes, allowing the final image to become focused precisely on the plane of the retina. The total magnification attained by the microscope is the product of the magnifications produced by the objective lens and the ocular lens.

Figure 18.1 Sectional diagram through a compound light microscope; that is, a microscope that has both an objective and an ocular lens.

Figure 18.2 The paths taken by light rays that form the image of the specimen and those that form the background light of the field. Light rays from the specimen are brought to focus on the retina, whereas background rays are out of focus, producing a diffuse bright field. As discussed later in the text, the resolving power of an objective lens is proportional to the sine of the angle α. Lenses with greater resolving power have shorter focal lengths, which means that the specimen is situated closer to the objective lens when it is brought into focus.

Resolution

Up to this point we have considered only the magnification of an object without paying any attention to the quality of the image produced, that is, the extent to which the detail of the specimen is retained in the image. Suppose you are looking at a structure in the microscope using a relatively high-power objective lens (for example, 63×) and an ocular lens that magnifies the image of the objective lens another fivefold (a 5× ocular). Suppose the field is composed of chromosomes and it is important to determine the number present, but some of them are very close together and cannot be distinguished as separate structures (Figure 18.3a). One solution to the problem might be to change oculars to increase the size of the object being viewed. If you were to switch from the 5× to a 10× ocular, you would most likely increase your ability to determine the number of chromosomes present (Figure 18.3b) because you have now spread the image of the chromosomes produced by the objective lens over a greater part of your retina. The more photoreceptors that provide information about the image, the more detail that can be seen (Figure 18.4). If, however, you switch to a 20× ocular, you are not likely to see additional detail, although the image is larger (Figure 18.3c), that is, occupies more retinal surface. The change of ocular fails to provide additional information because the image produced by the objective lens does not possess any further detail to be enhanced by the increased power of the ocular lens. The second switch in oculars provides only empty magnification (as in Figure 18.3c).

Figure 18.3 Magnification versus resolution. The transition from (a) to (b) provides the observer with increased magnification and resolution, whereas the transition from (b) to (c) provides only increased magnification (empty magnification). In fact, the quality of the image actually deteriorates as empty magnification increases.

Figure 18.4 The resolving power of the eye. A highly schematic illustration of the relationship between the stimulation of individual photoreceptors (left) and the resulting scene one would perceive (right). The diagram illustrates the value of having the image fall over a sufficiently large area of the retina.

The optical quality of an objective lens is measured by the extent to which the fine detail present in a specimen can be discriminated, or resolved. The resolution attained by a microscope is limited by diffraction. Because of diffraction, light emanating from a point in the specimen can never be seen as a point in the image, but only as a small disk. If the disks produced by two nearby points overlap, the points cannot be distinguished in the image. Thus the resolving power of a microscope can be defined in terms of the ability to see two neighboring points in the visual field as two distinct entities. The resolving power of a microscope is limited by the wavelength of the illumination according to the equation

$$ d = \frac{0.61\,\lambda}{n \sin \alpha} $$

where d is the minimum distance that two points in the specimen must be separated to be resolved, λ is the wavelength of light (527 nm is used for white light), and n is the refractive index of the medium present between the specimen and objective lens. Alpha (α) is equal to half the angle of the cone of light entering the objective lens as shown in Figure 18.2. Alpha is a measure of the light-gathering ability of the lens and is directly related to its aperture. The denominator of the equation above is called the numerical aperture (N.A.). The numerical aperture is a constant for each lens, a measure of its light-gathering qualities. For an objective that is designed for use in air, the maximum possible N.A. is 1.0 because the sine of the maximum possible angle of α, 90°, is 1, and the refractive index of air is 1.0. For an objective designed to be immersed in oil, the maximum possible N.A. is approximately 1.5. As a common rule of thumb, the maximum useful magnification of a light microscope ranges between 500 and 1000 times the numerical aperture of the objective lens being used. Attempts to enlarge the image beyond this point result in empty magnification, and the quality of the image deteriorates. High numerical aperture is achieved by using lenses with a short focal length, which requires the lens to be placed very close to the specimen.

If we substitute the minimum possible wavelength of illumination and the greatest possible numerical aperture in the preceding equation, we can determine the limit of resolution of the conventional light microscope. When these substitutions are made, a value of slightly less than 0.2 µm (or 200 nm) is obtained, which is sufficient to resolve larger cellular organelles, such as nuclei and mitochondria. In contrast, the limit of resolution of the naked eye, which has a numerical aperture of about 0.004, is approximately 0.1 mm.

In addition to these theoretical factors, resolving power is also affected by optical flaws, or aberrations. There are seven important aberrations, and they are the handicaps that lensmakers must overcome to produce objective lenses whose actual resolving power approaches that of the theoretical limits. Objective lenses are made of a complex series of lenses, rather than a single convergent lens, to eliminate these aberrations. One lens unit typically affords the required magnification, while the others compensate for errors in the first lens to provide a corrected overall image.
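The arithmetic behind the resolution limit can be checked with a short calculation. The following is a minimal sketch in Python, not part of the original text. The function names are hypothetical, and the specific inputs of 400 nm (taken as the short end of the visible spectrum) and an N.A. of 1.4 (a value just under the approximate oil-immersion maximum of 1.5) are assumptions chosen for illustration.

```python
import math


def numerical_aperture(n, alpha_deg):
    """N.A. = n * sin(alpha), where alpha is half the angle of the light cone."""
    return n * math.sin(math.radians(alpha_deg))


def resolution_limit_nm(wavelength_nm, na):
    """Minimum resolvable separation d = 0.61 * wavelength / N.A., in nm."""
    return 0.61 * wavelength_nm / na


def useful_magnification_range(na):
    """Rule of thumb from the text: 500 to 1000 times the numerical aperture."""
    return (500 * na, 1000 * na)


def total_magnification(objective, ocular):
    """Total magnification is the product of objective and ocular powers."""
    return objective * ocular


# Air objective at the theoretical maximum half-angle of 90 degrees.
na_air_max = numerical_aperture(1.0, 90.0)

# Assumed best case: 400 nm light and an oil-immersion N.A. of 1.4.
d_best_nm = resolution_limit_nm(400, 1.4)

# Naked eye, using the N.A. of about 0.004 and the 527 nm value from the text.
d_eye_mm = resolution_limit_nm(527, 0.004) / 1e6

print(f"Maximum N.A. in air: {na_air_max:.2f}")
print(f"Best-case light microscope limit: {d_best_nm:.0f} nm")
print(f"Naked eye limit: {d_eye_mm:.2f} mm")
print(f"63x objective with 10x ocular: {total_magnification(63, 10)}x")
print(f"Useful magnification for N.A. 1.4: {useful_magnification_range(1.4)}")
```

Under these assumptions the best-case limit comes out at about 174 nm, consistent with the value of slightly less than 0.2 µm, and the naked-eye estimate of about 0.08 mm is of the same order as the 0.1 mm quoted above.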

Visibility

A more practical matter than the limit of resolution is visibility, which is concerned with the factors that allow an object actually to be observed. This may seem like a trivial matter; if an object is there, it should be seen. Consider the case of a clear glass bead. Under most conditions, against most backgrounds, the bead is clearly visible. If, however, a glass bead is dropped into a beaker of immersion oil having the same refractive index as glass, the bead disappears from view because it no longer affects light in any obvious manner that is different from the background fluid. Anyone who has spent any time searching for an amoeba can appreciate the problem of visibility when using the light microscope.

What we see through a window or through a microscope are those objects that affect the light differently from their background. Another term for visibility in this sense of the word is contrast, or the difference in appearance between adjacent parts of an object or between an object and its background. The need for contrast can be appreciated by considering the stars. Whereas a clear night sky can be filled with stars, the same sky during the day appears devoid of celestial bodies. The stars have disappeared from view, but they haven’t disappeared from the sky. They are no longer visible against the much brighter background.

In the macroscopic world, we examine objects by having the light fall on them and then observing the light that is reflected back to our eyes. In contrast, when we use a microscope, we place the specimen between the light source and our eyes and view the light that is transmitted through the object (or more properly, diffracted by the object). If you take an object and go into a room with one light source and hold the object between the light source and your eye, you can appreciate part of the difficulty in such illumination; it requires that the object being examined be nearly transparent, that is, translucent. Therein lies another aspect of the problem: objects that are “nearly transparent” can be difficult to see.

One of the best ways to make a thin, translucent specimen visible under the microscope is to stain it with a dye, which absorbs only certain wavelengths within the visible spectrum. Those wavelengths that are not absorbed are transmitted to the eye, causing the stained object to appear colored. Different dyes bind to different types of biological molecules, and therefore, not only do these procedures heighten visibility of the specimen, they can also indicate where in the cells or tissues different types of substances are found. A good example is the Feulgen stain, which is specific for DNA, causing the chromosomes to appear colored under the microscope (Figure 18.5). A problem with stains is that they generally cannot be used with living cells; they are usually toxic, or the staining conditions are toxic, or the stains do not penetrate the plasma membrane. The Feulgen stain, for example, requires that the tissue be hydrolyzed in acid before the stain is applied.

Figure 18.5 The Feulgen stain. This staining procedure is specific for DNA as indicated by the localization of the dye to the chromosomes of this onion root tip cell that was in metaphase of mitosis at the time it was fixed. (ED RESCHKE.)

Different types of light microscopes use different types of illumination. In a bright-field microscope, the cone of light that illuminates the specimen is seen as a bright background against which the image of the specimen must be contrasted. Bright-field microscopy is ideally suited for specimens of high contrast, such as stained sections of tissues, but it may not provide optimal visibility for other specimens. In the following sections, we will consider various means of making specimens more visible in a light microscope.

Preparation of Specimens for Bright-Field Light Microscopy

Specimens to be observed with the light microscope are broadly divided into two categories: whole mounts and sections. A whole mount is an intact object, either living or dead, and can consist of an entire microscopic organism such as a protozoan or a small part of a larger organism. Most tissues of plants and animals are much too opaque for microscopic analysis unless examined as a very thin slice, or section. To prepare a section, the cells are first killed by immersing the tissue in a chemical solution, called a fixative. A good fixative rapidly penetrates the cell membrane and immobilizes all of its macromolecular material so that the structure of the cell is maintained as close as possible to that of the living state. The most common fixatives for the light microscope are solutions of formaldehyde, alcohol, or acetic acid. After fixation, the tissue is dehydrated by transfer through a series of alcohols and then usually embedded in paraffin (wax), which provides mechanical support during sectioning. Paraffin is used as an embedding medium because it is readily dissolved by organic solvents. Slides containing adherent paraffin sections are immersed in toluene, which dissolves the wax, leaving the thin slice of tissue attached to the slide, where it can be stained or treated with antibodies or other agents. After staining, a coverslip is mounted over the tissue using a mounting medium that has the same refractive index as the glass slide and coverslip.

Phase-Contrast Microscopy

Small, unstained specimens such as a living cell can be very difficult to see with a bright-field microscope (Figure 18.6a). The phase-contrast microscope makes highly transparent objects more visible (Figure 18.6b). We can distinguish different parts of an object because they affect light differently from one another. One basis for such differences is refractive index. Cell organelles are made up of different proportions of various molecules: DNA, RNA, protein, lipid, carbohydrate, salts, and water. Regions of different composition are likely to have different refractive indices. Normally such differences cannot be detected by our eyes. However, the phase-contrast microscope converts differences in refractive index into differences in intensity (relative brightness and darkness), which are visible to the eye. Phase-contrast microscopes accomplish this result by (1) separating the direct light that enters the objective lens from the diffracted light emanating from the specimen and (2) causing light rays from these two sources to interfere with one another. The relative brightness or darkness of each part of the image reflects the way in which the light from that part of the specimen interferes with the direct light.

Figure 18.6 A comparison of cells seen with different types of light microscopes. Light micrographs of a ciliated protist as observed under bright-field (a), phase-contrast (b), and differential interference contrast (DIC) (or Nomarski) optics (c). The organism is barely visible under bright-field illumination but clearly seen under phase-contrast and DIC microscopy. (MICROGRAPHS BY M. I. WALKER/PHOTO RESEARCHERS, INC.)

Phase-contrast microscopes are most useful for examining intracellular components of living cells at relatively high resolution. For example, the dynamic motility of mitochondria, mitotic chromosomes, and vacuoles can be followed and filmed with these optics. Simply watching the way tiny particles and vacuoles of cells are bumped around in a random manner in a living cell conveys an excitement about life that is unattainable by observing stained, dead cells. The greatest benefit derived from the invention of the phase-contrast microscope has not been in the discovery of new structures, but in its everyday use in research and teaching labs for observing cells in a more revealing way.

The phase-contrast microscope has optical handicaps that result in loss of resolution, and the image suffers from interfering halos and shading produced where sharp changes in refractive index occur. The phase-contrast microscope is a type of interference microscope. Other types of interference microscopes minimize these optical artifacts by achieving a complete separation of direct and diffracted beams using complex light paths and prisms. Another type of interference system, termed differential interference contrast (DIC), or sometimes Nomarski interference after its developer, delivers an image that has an apparent three-dimensional quality (Figure 18.6c). Contrast in DIC microscopy depends on the rate of change of refractive index across a specimen. As a consequence, the edges of structures, where the refractive index varies markedly over a relatively small distance, are seen with especially good contrast.
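The statement that DIC contrast follows the rate of change of refractive index, rather than its absolute value, can be illustrated with a toy calculation. The sketch below is written in Python and is only a numerical analogy, not a model of DIC optics. The one-dimensional profile is hypothetical: 1.33 is the refractive index of water, while 1.38 for the cell interior and the 0.5 µm sampling interval are assumed values.

```python
def refractive_index_gradient(profile, spacing_um):
    """Finite-difference rate of change of refractive index between neighbors."""
    return [(b - a) / spacing_um for a, b in zip(profile, profile[1:])]


# Hypothetical scan line: medium, then a cell, then medium again.
profile = [1.33, 1.33, 1.33, 1.38, 1.38, 1.38, 1.38, 1.33, 1.33]

gradient = refractive_index_gradient(profile, spacing_um=0.5)

# Positions where the refractive index changes, i.e. the edges of the cell.
edges = [i for i, g in enumerate(gradient) if abs(g) > 1e-9]

print("Gradient along the scan:", [round(g, 3) for g in gradient])
print("Intervals with a nonzero gradient:", edges)
```

In this toy profile the gradient is zero both in the medium and inside the cell and is nonzero only at the two boundaries, which is why a signal that tracks the gradient emphasizes edges.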

Fluorescence Microscopy (and Related Fluorescence-Based Techniques)

Over the past couple of decades, the light microscope has been transformed from an instrument designed primarily to examine sections of fixed tissues to one capable of observing the dynamic events occurring at the molecular level in living cells. These advances in live-cell imaging have been made possible to a large extent by innovations in fluorescence microscopy. The fluorescence microscope allows viewers to observe the location of certain molecules (called fluorophores or fluorochromes). Fluorophores absorb invisible, ultraviolet radiation and release a portion of the energy in the longer, visible wavelengths, a phenomenon called fluorescence. The light source in a fluorescence microscope produces a beam of ultraviolet light that travels through a filter, which blocks all wavelengths except that which is capable of exciting the fluorophore. The beam of monochromatic light is focused on the specimen containing the fluorophore, which becomes excited and emits light of a visible wavelength that is focused by the objective lens into an image that can be seen by the viewer. Because the light source produces only ultraviolet (black) light, objects stained with a fluorophore appear brightly colored against a black background, providing very high contrast.
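Because the energy of a photon is inversely proportional to its wavelength (E = hc/λ), the shift to a longer emission wavelength corresponds to the release of only part of the absorbed energy. The following Python sketch makes that relationship concrete. The excitation wavelength of 365 nm and the emission wavelength of 510 nm are assumed illustrative values, not figures taken from the text, and the function name is hypothetical.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, joule-seconds
LIGHT_SPEED_M_S = 299_792_458     # speed of light, meters per second
JOULES_PER_EV = 1.602176634e-19   # joules per electron volt


def photon_energy_ev(wavelength_nm):
    """Energy of a single photon, E = h * c / wavelength, in electron volts."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


# Assumed illustrative wavelengths (not from the text).
excitation_nm = 365   # ultraviolet
emission_nm = 510     # green

absorbed = photon_energy_ev(excitation_nm)
emitted = photon_energy_ev(emission_nm)

print(f"Absorbed photon: {absorbed:.2f} eV")
print(f"Emitted photon:  {emitted:.2f} eV")
print(f"Fraction of energy re-emitted as light: {emitted / absorbed:.2f}")
```

With these assumed wavelengths, roughly 70 percent of the absorbed photon energy reappears in the emitted photon; the remainder is lost within the fluorophore before emission.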

There are many different ways that fluorescent molecules can be used in cell and molecular biology. In one of its most common and earliest applications, a relatively small organic fluorophore (such as rhodamine or fluorescein) is covalently linked (conjugated) to an antibody to produce a fluorescent antibody that can be used to determine the location of a specific protein within the cell. This technique is called immunofluorescence and is illustrated by the micrograph in Figure 9.30a. Immunofluorescence is discussed further on page 783. Fluorescently labeled proteins can also be used to study dynamic processes as they occur in a living cell. For example, a small organic fluorophore can be linked to a cellular protein, such as actin or tubulin, and the fluorescently labeled protein injected into a living cell (as in Figures 9.4 and 9.73b). Fluorophores can also be used to locate DNA or RNA molecules that contain specific nucleotide sequences as described on page 402 and illustrated in Figure 10.19. In other examples, fluorophores have been used to study the sizes of molecules that can pass between cells (see Figure 7.33), as indicators of transmembrane potentials (see Figure 5.21), or as probes to determine the free Ca2+ concentration in the cytosol (see Figure 15.29). The use of calcium-sensitive fluorophores is discussed on page 649.

In all of the examples discussed in the previous paragraph, the biomolecules have been made fluorescent by conjugating them with a synthetic organic fluorophore. But there are also naturally fluorescent molecules. Humans have probably wondered for thousands of years why jellyfish and certain other marine organisms glow in the dark. It wasn’t until the early 1960s that Osamu Shimomura discovered that a certain species of jellyfish (Aequorea victoria) owes its luminescent character to the presence of fluorescent proteins, such as aequorin and the green fluorescent protein (GFP), that he was able to purify and analyze. In the early 1990s, Douglas Prasher, Martin Chalfie and colleagues cloned the gene that encodes GFP and showed that the fluorescent protein could be genetically incorporated and expressed in another organism. This study set the stage for the use of GFP to study the spatial and temporal distribution of proteins in living cells. Unlike most other fluorescent proteins, GFP does not require an additional cofactor to absorb and emit light; instead, the light-absorbing/emitting chromophore is formed by self-modification (i.e., by an autocatalytic reaction) of three of the amino acids that make up the primary structure of the GFP polypeptide. In most studies employing GFP, a recombinant DNA is constructed in which the coding region of the GFP gene is joined to the coding region of the gene for the protein under study. This recombinant DNA is used to transfect cells, which then synthesize a chimeric protein containing fluorescent GFP fused to the protein under study. Use of GFP to study membrane trafficking is discussed on page 273. In these various experimental protocols, the labeled proteins participate in the normal activities of the cell, and their location can be followed microscopically to reveal the dynamic activities in which the protein participates (see Figures 8.4 and 9.2).

Live-cell imaging studies can often be made more informative by the simultaneous use of GFP variants that exhibit different spectral properties. Variants of GFP that fluoresce in shades of blue (BFP), yellow (YFP), and cyan (CFP) were generated by Roger Tsien of the University of California, San Diego, through directed mutagenesis of the GFP gene. In addition, a distantly related red fluorescent tetrameric protein (DsRed) has been isolated from a sea anemone. Monomeric variants of DsRed (e.g., mBanana, mTangerine, and mOrange), which fluoresce in a variety of distinguishable colors, have also been generated by mutagenesis experiments in Tsien’s lab. The type of information that can be obtained using this colorful “palette” of GFP variants is illustrated by the study depicted in Figure 18.7, in which researchers generated strains of mice whose neurons contained differently colored fluorescent proteins. When a muscle of one of these mice was exposed surgically, the investigators could observe the dynamic interactions between the variously colored neurons and the neuromuscular junctions being innervated (see Figure 4.56 for a drawing of this type of junction). They watched, for example, as branches from a CFP-colored neuron competed with branches from a YFP-colored neuron for synaptic contact with the muscle tissue. In each case they found that, when two neurons compete for innervation of different muscle fibers, all of the “winning” branches belong to one of the neurons, while all of the “losing” branches belong to the other neuron (Figure 18.7b). The chapter-opening micrographs provide another example of how much can be learned about the spatial and temporal dynamics of cellular events using two (or more) spectrally distinct fluorescent proteins. In this case, the dual-label strategy has allowed investigators to follow the simultaneous movements of two different proteins in real time as they occur within the boundaries of a single cellular organelle. Another example is described in the Experimental Pathway of Chapter 8 (page 321).

Figure 18.7 Use of GFP variants to follow the dynamic interactions between neurons and their target cells in vivo. (a) A portion of the brain of a mouse with two differently colored, fluorescent neurons. These mice are generated by mating transgenic animals whose neurons are labeled with one or the other fluorescent protein. (b) Fluorescence micrograph of portions of two neurons, one labeled with YFP and the other with CFP. The arrows indicate two different neuromuscular junctions on two different muscle fibers in which the YFP-labeled axonal branch has outcompeted the CFP-labeled branch. The third junction is innervated by the CFP-labeled axon in the absence of competition. (A: COURTESY N. KASTHURI & J. W. LICHTMAN, WASHINGTON UNIVERSITY SCHOOL OF MEDICINE; B: FROM N. KASTHURI AND JEFF W. LICHTMAN, NATURE 424:429, 2003, FIG. 4A. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

GFP variants have also been useful in a technique, called fluorescence (or Förster) resonance energy transfer (FRET), which can measure distances between fluorophores in the nanoscale range. FRET is typically employed to measure changes in distance between two parts of a protein (or between two separate proteins within a larger structure). FRET can be used to study such changes as they occur in vitro or within a living cell. FRET is based on the fact that excitation energy can be transferred from one fluorescent group (a donor) to another fluorescent group (an acceptor), as long as the two groups are in very close proximity (1 to 10 nm). This transfer of energy reduces the fluorescence intensity of the donor and increases the fluorescence intensity of the acceptor. The efficiency of transfer between two fluorescent groups that are bound to strategic sites on a protein decreases sharply as the distance between the two groups increases. As a result, determination of changes in fluorescence of the donor and acceptor groups that occur during a cellular process provides a measure of changes in the distance between them at various stages in the process. This technique is illustrated in Figure 18.8, where two different GFP variants (labeled ECFP and EYFP) have been covalently linked to two different parts of a cGMP-dependent protein kinase (PKG).


In the absence of bound cGMP, the two fluorophores are too far apart for energy transfer to occur. Binding of cGMP induces a conformational change in the protein that brings the two fluorophores into close enough proximity for FRET to occur. FRET can also be used to follow many other processes including protein folding or the association and dissociation of components within a membrane. The separation of the cytoplasmic tails of integrin subunits following activation by talin (Figure 7.14) is an example of an event that has been studied by FRET. Figure 15.8 shows an early example where FRET was used to reveal the changes in cAMP levels following binding of a neurotransmitter with its surface receptor.

Another innovation in fluorescence microscopy has been the development of computational programs that can be used to examine large numbers of images and score them for various characteristics. These automated technologies have been particularly useful for screening the phenotypes of cells that have been subjected to a library of different siRNAs, looking for genes encoding proteins that are involved in a particular cellular process, such as membrane trafficking (page 278) or mitosis (page 780). In addition to being tedious and time-consuming, attempting to analyze galleries of images such as that seen in Figure 18.50 by manual inspection is likely to introduce errors resulting from inherent subjective differences and biases of human observers.

Another recent innovation in fluorescence microscopy falls under the heading of multiphoton microscopy, in which fluorophores present within cells are excited by the simultaneous arrival of two or more photons of longer wavelength. The longer the wavelength of the photon, the less its energy (which makes it less destructive to the illuminated cells and to the absorbing fluorophore) and the greater its penetrating power. Using multiphoton microscopy, researchers can track the movements of fluorescent proteins that are present at least 200 µm deep within a living tissue. An example of the use of this approach can be seen in Figure 17.11, where fluorescently labeled immune cells are followed as they move around within an excised lymph node.
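The sharp distance dependence of FRET described earlier in this section is commonly expressed by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50 percent efficient. This relation is not stated in the text and is introduced here only for illustration. In the Python sketch below, the function name is hypothetical and the R0 of 5 nm is an assumed value that falls within the 1 to 10 nm range quoted above; actual values depend on the particular donor-acceptor pair.

```python
def fret_efficiency(distance_nm, forster_radius_nm=5.0):
    """Foerster relation: E = 1 / (1 + (r / R0)**6).

    forster_radius_nm (R0) is an assumed value used for illustration only.
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Transfer efficiency over the 1-10 nm range in which FRET is useful.
for r in (2.5, 4.0, 5.0, 6.0, 10.0):
    print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r):.3f}")
```

With the assumed R0, halving the distance from 5 nm to 2.5 nm raises the efficiency from 0.5 to about 0.98, whereas doubling it to 10 nm lowers the efficiency to under 0.02. This steep sixth-power dependence is what allows a conformational change of a few nanometers, such as the one in PKG, to produce a readily measurable change in donor and acceptor fluorescence.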

Video Microscopy and Image Processing

Figure 18.8 Fluorescence resonance energy transfer (FRET). This schematic diagram shows an example of the use of FRET technology to follow the change in conformation of a protein (PKG) following cGMP binding. The two small, barrel-shaped fluorescent proteins— enhanced CFP (ECFP) and enhanced YFP (EYFP)—are shown in their fluorescent color. In the absence of cGMP, excitation of ECFP with light of 440 nm leads to emission of light of 480 nm from the fluorescent protein. Following cGMP binding, a conformational change in PKG brings the two fluorescent proteins into close enough proximity for FRET to occur. As a result, excitation of the ECFP donor with light of 440 nm leads to energy transfer to the EYFP acceptor and emission of light of 535 nm from the acceptor. The enhanced versions of these proteins have greater fluorescence intensity and tend to be more stable than the original protein molecules. (FROM MORITOSHI SATO AND YOSHIO UMEZAWA, ANALYTICAL CHEMISTRY 72:5924, 2000. © 2000 AMERICAN CHEMICAL SOCIETY.)

Just as a microscopic field can be seen with the eyes or filmed by a camera, it can also be viewed electronically and taped using a video camera. Video cameras offer a number of advantages for viewing specimens. Special types of video cameras (called charge-coupled device, or CCD cameras) are constructed to be very sensitive to light, which allows them to image specimens at very low illumination. This is particularly useful when observing live specimens, which are easily damaged by the heat from a light source, and fluorescently stained specimens, which fade rapidly on exposure to light. In addition, video cameras can detect and amplify very small differences in contrast within a specimen so that very small objects become visible. The photographs in Figure 9.22a, for example, show images of individual microtubules (diameter of 0.025 ␮m) that are far below the limit of resolution of a conventional light microscope (0.2 ␮m). As an added advantage, electronic images produced by video cameras are readily converted to


digital images. Digital images consist of a discrete number of picture elements (pixels) each of which has an assigned color and brightness value corresponding to that site in the original image. Digital images can be stored as computer files and subjected to computer processing to greatly increase their information content. In one technique, the distracting out-of-focus background in a visual field is stored by the computer and then subtracted from the image containing the specimen. This greatly increases the clarity of the image. Similarly, differences in brightness in an image can be converted to differences in color, which makes them much more apparent to the eye.
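The two processing steps just described, background subtraction and conversion of brightness to color, can be illustrated with a minimal sketch. It treats a digital image as a small grid of brightness values from 0 to 255. The function names, the sample pixel values, and the blue-to-red color mapping are illustrative assumptions, not part of any particular microscope software.

```python
# Illustrative sketch only: tiny grayscale "images" as nested lists of 0-255 values.

def subtract_background(image, background):
    """Subtract a stored background frame pixel by pixel, clamping at 0."""
    return [
        [max(pixel - bg, 0) for pixel, bg in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]


def pseudocolor(value):
    """Map brightness (0-255) to an (R, G, B) triple: dim -> blue, bright -> red.

    The mapping is an arbitrary choice made for this example.
    """
    return (value, 0, 255 - value)


# Hypothetical data: a nearly uniform background and a field with one bright object.
background = [[20, 22, 21], [19, 20, 23], [21, 20, 20]]
image = [[22, 25, 21], [20, 120, 30], [21, 24, 19]]

cleaned = subtract_background(image, background)
colored = [[pseudocolor(v) for v in row] for row in cleaned]

for row in cleaned:
    print(row)
```

After subtraction, the single bright pixel stands out against values near zero, and the color mapping turns what was a modest difference in gray level into a difference in hue.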

Laser Scanning Confocal Microscopy The use of video cameras, electronic images, and computer processing has led to a renaissance in light microscopy over the past couple of decades. So too has the development of a new type of light microscope. When a whole cell or a section of an organ is examined under a standard light microscope, the observer views the specimen at different depths by changing the position of the objective lens by rotating the focusing knob. As this is done, different parts of the specimen go in and out of focus. But the fact that the specimen contains different levels of focus reduces the ability to form a crisp image because those parts of the specimen above and below the plane of focus interfere with the light rays from that part that is in the plane of focus. In the late 1950s, Marvin Minsky of the Massachusetts Institute of Technology invented a revolutionary new instrument called a confocal microscope that produces an image of a thin plane situated within a much thicker specimen. A schematic diagram of the optical components

[Figure 18.9 labels. (a) Laser, illuminating aperture, dichroic mirror, objective lens, specimen, focal plane, pinhole aperture, photomultiplier, computer. (b) Micrographs.]

18.1 The Light Microscope

Figure 18.9 Laser scanning confocal fluorescence microscopy. (a) The light paths in a confocal fluorescence microscope. Light of short (blue) wavelength is emitted by a laser source, passes through a tiny aperture, and is reflected by a dichroic mirror (a type of mirror that reflects certain wavelengths and transmits others) into an objective lens and focused onto a spot in the plane of the specimen. Fluorophores in the specimen absorb the incident light and emit light of longer wavelength, which is able to pass through the dichroic mirror and come to focus in a plane that contains a pinhole aperture. The light then passes into a photomultiplier tube that amplifies the signal and is transmitted to a computer which forms a processed, digitized image. Any light rays that are emitted from above or below the optical plane in the specimen are prevented from passing through the pinhole aperture and thus do not contribute to formation of the image. This diagram shows the illumination of a single spot in the specimen. Different sites within this specimen plane are illuminated by means of a laser scanning process. The diameter of the pinhole aperture is adjustable. The smaller the aperture, the thinner the optical section and the greater the resolution, but the less intense the signal. (b) Confocal fluorescence micrographs of three separate optical sections, each 0.3 μm thick, of a yeast nucleus stained with two different fluorescently labeled antibodies. The red fluorescent antibody has stained the DNA within the nucleus, and the green fluorescent antibody has stained a telomere-binding protein that is localized at the periphery of the nucleus. (A: FROM THIERRY LAROCHE AND SUSAN M. GASSER, CELL 75:543, 1993, CELL BY CELL PRESS. REPRODUCED WITH PERMISSION OF CELL PRESS IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

and light paths within a modern version of a laser scanning confocal microscope is shown in Figure 18.9a. In this type of microscope, the specimen is illuminated by a finely focused laser beam that rapidly scans across the specimen at a single depth, thus illuminating only a thin plane (or “optical section”) within the specimen. As indicated in Figure 18.9a, confocal microscopes are used with fluorescence optics. As described earlier, short-wavelength incident light is absorbed by the fluorophores in a specimen and reemitted at longer wavelength. Light emitted from the specimen is brought to focus at a site within the microscope that contains a pinhole aperture. Thus, the aperture and the illuminated plane in the specimen are confocal. Light rays emitted from the illuminated plane of the specimen can pass through the aperture, whereas any light rays that might emanate from above or below this plane are prevented from participating in image formation. As a result, out-of-focus points in the specimen become invisible.


Figure 18.9b shows micrographs of a single cell nucleus taken at three different planes within the specimen. It is evident that objects out of the plane of focus have little effect on the quality of the image of each section. If desired, the images from separate optical sections can be stored in a computer and merged with one another to reconstruct a three-dimensional model of the entire object.
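As a rough illustration of how stored optical sections can be combined, the sketch below holds three small sections as a stack indexed by depth and collapses them into a single view by keeping the brightest value at each position (a maximum-intensity projection). This is only one simple way of merging sections, chosen here for brevity. The pixel values and function name are hypothetical.

```python
def max_intensity_projection(sections):
    """Collapse a z-stack (list of equally sized 2D sections) into one 2D image.

    For each (row, column) position, keep the brightest value found at any depth.
    """
    return [
        [max(pixels) for pixels in zip(*rows)]
        for rows in zip(*sections)
    ]


# Hypothetical optical sections taken at three depths of the same specimen.
section_top = [[0, 5, 0], [0, 0, 0]]
section_middle = [[0, 1, 9], [0, 7, 0]]
section_bottom = [[3, 0, 0], [0, 2, 8]]

# The stack itself is already a simple three-dimensional model: stack[z][y][x].
stack = [section_top, section_middle, section_bottom]
projection = max_intensity_projection(stack)

print(projection)
```

Because each section contains only in-focus light, the stack can be indexed directly by depth, row, and column without any correction for out-of-focus blur.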

Super-Resolution Fluorescence Microscopy

Chapter 18 Techniques in Cell and Molecular Biology

The most recent innovations in fluorescence microscopy fall under the umbrella of “super-resolution” technologies. It was pointed out on page 734 that the limit of resolution of the light microscope is approximately 200 nm owing to the diffraction properties of light. Over the past few years, a number of complex optical techniques have been developed that allow researchers to localize fluorescently labeled proteins at resolutions in the tens of nanometers. Most of these super-resolution techniques are based on the discovery that a certain mutation of the GFP polypeptide converts the protein into a photoactivatable molecule (PA-GFP). This mutant GFP remains essentially nonfluorescent until it is activated by violet light. Subsequent studies have expanded the number of available photoactivatable fluorescent proteins and have led to the development of photoswitchable proteins, whose fluorescence emission at a particular wavelength can be switched on and off with pulses of light. We will briefly consider only one of the super-resolution techniques, called STORM (stochastic optical reconstruction microscopy), that utilizes photoswitchable fluorescent proteins. STORM allows investigators to localize a single fluorescent molecule within a resolution of less than 20 nm. In this


Figure 18.10 Breaking the light microscope’s limit of resolution. (a) A conventional fluorescence micrograph of a portion of a cultured mammalian cell with microtubules labeled in green and clathrin-coated pits in red. At this high level of magnification, the image appears pixelated. Moreover, the overlap between the two fluorescent labels produces an orange color, which suggests that the two structures are interconnected. (b) A STORM “super-resolution” micrograph of a similar field showing the microtubules and clathrin-coated pits clearly

technique, a specimen containing one or more photoswitchable fluorophores is illuminated with pulses of light of appropriate wavelengths, which has the effect of switching the fluorescence of the labeled molecules on and off. During each cycle of illumination, most of the labeled molecules remain dark, but a small fraction of them are randomly activated. These individual activated fluorescent molecules appear as spots that, because of the diffraction properties of light, are several hundred nanometers wide. The center of each spot, which represents the actual location of the single fluorophore responsible for emitting the light, can be determined with high accuracy. This process is then repeated for a number of imaging cycles, producing a large number of coordinates that represent the locations of many of the fluorophores in the specimen. Analysis of these coordinates allows the researchers to reconstruct a super-resolution image of the field of view. The marked increase in resolution that can be attained with the super-resolution STORM technique versus that of a conventional fluorescence technique is evident by comparing the image of Figure 18.10a with those of Figures 18.10b and c.
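The key step in this procedure, finding the center of a spot that is itself several hundred nanometers wide, can be sketched in a few lines. The example below simulates a noise-free spot from one fluorophore and estimates its position as the intensity-weighted centroid of the pixels. The pixel size, spot width, and fluorophore position are assumed values, and the centroid is a simple stand-in for the curve-fitting methods used in practice.

```python
import math

PIXEL_NM = 100.0   # assumed camera pixel size in the specimen plane (nm)
SIGMA_NM = 150.0   # assumed width parameter of the diffraction-limited spot (nm)


def simulate_spot(x0, y0, size=21):
    """Return a size x size image of a Gaussian spot centered at (x0, y0), in nm."""
    return [
        [
            math.exp(
                -((i * PIXEL_NM - x0) ** 2 + (j * PIXEL_NM - y0) ** 2)
                / (2 * SIGMA_NM ** 2)
            )
            for i in range(size)
        ]
        for j in range(size)
    ]


def localize(image):
    """Estimate the spot center as the intensity-weighted centroid, in nm."""
    total = sum(sum(row) for row in image)
    x = sum(v * i * PIXEL_NM for row in image for i, v in enumerate(row)) / total
    y = sum(v * j * PIXEL_NM for j, row in enumerate(image) for v in row) / total
    return x, y


# Hypothetical fluorophore position that does not coincide with a pixel center.
spot = simulate_spot(1030.0, 980.0)
x_est, y_est = localize(spot)
print(f"estimated center: ({x_est:.1f}, {y_est:.1f}) nm")
```

Although the simulated spot spreads over many 100-nm pixels, the estimated center falls far closer to the true position than the width of the spot, which is why repeating the measurement over many activation cycles yields a map of fluorophore coordinates at much finer resolution. Real images contain noise, which limits the accuracy attainable.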

18.2 | Transmission Electron Microscopy The electron micrographs shown in this text were taken by two different types of electron microscopes. Transmission electron microscopes (TEMs) form images using electrons that are transmitted through a specimen, whereas scanning electron microscopes (SEMs) utilize electrons that have bounced off the


resolved and spatially separated from one another. (c) A magnified view of a portion of part b. (Discussion of this and other super-resolution techniques can be found in Nature 459:638, 2009; Ann. Rev. Biochem. 78:993, 2009; Cell 143:1047, 2010; J. Cell Biol. 190:165, 2010; and J. Cell Sci. 124:1607, 2011.) (FROM MARK BATES ET AL., COURTESY OF XIAOWEI ZHUANG, SCIENCE 317:1752, 2007; © 2007. REPRINTED WITH PERMISSION FROM AAAS.)



two-hundred-fold increase in resolution. Note the difference in the details of the muscle myofibrils, mitochondria, and the capillary containing a red blood cell. Whereas the light microscope cannot provide any additional information to that of a, the electron microscope can provide much more information, producing images, for example, of the structure of the individual membranes within a small portion of one of the mitochondria (as in Figure 5.22). (COURTESY OF DOUGLAS E. KELLY AND M. A. CAHILL.)

surface of the specimen. All of the comments in this section of the chapter concern the use of the TEM; the SEM is discussed separately on page 746. The transmission electron microscope can provide vastly greater resolution than the light microscope. This is readily illustrated by comparing the two photos in Figure 18.11, which show images of adjacent sections of the same tissue at the same magnification using a light or an electron microscope. Although the photo in Figure 18.11a is near the limit of resolution of the light microscope, the photo in Figure 18.11b is a very low-power electron micrograph. The great resolving power of the electron microscope derives from the wave properties of electrons. As indicated in the equation on page 734, the limit of resolution of a microscope is directly proportional to the wavelength of the illuminating light: the longer the wavelength, the poorer the resolution. Unlike a photon of light, which has a constant wavelength, the wavelength of an electron varies with the speed at which the particle is traveling, which in turn depends on the accelerating voltage applied in the microscope. This relationship is defined by the equation

$$\lambda = \sqrt{150/V}$$

where λ is the wavelength in angstroms and V is the accelerating voltage in volts. Standard TEMs operate with a voltage range from 10,000 to 100,000 V. At 60,000 V, the wavelength of an electron is approximately 0.05 Å. If this wavelength and the numerical aperture attainable with the light microscope were substituted into the equation on page 734, the limit of resolution would be about 0.03 Å. In actual fact, the resolution attainable with a standard transmission electron microscope is about two orders of magnitude less than its theoretical limit. This is due to serious spherical aberration of electron-focusing lenses, which requires that the numerical aperture of the lens be made very small (generally between 0.01 and 0.001). The practical limit of resolution of standard TEMs is in the range of 3 to 5 Å. The actual limit when observing cellular structure is more typically in the range of 10 to 15 Å. Electron microscopes consist largely of a tall, hollow cylindrical column through which the electron beam passes and a console containing a panel of dials that electronically control the operation in the column. The top of the column contains the cathode, a tungsten wire filament that is heated to provide a source of electrons. Electrons are drawn from the hot filament and accelerated as a fine beam by the high voltage applied between the cathode and anode. Air is pumped out of the column prior to operation, producing a vacuum through which the electrons travel. If the air were not removed, electrons would be prematurely scattered by collision with gas molecules. Just as a beam of light rays can be focused by a ground glass lens, a beam of negatively charged electrons can be focused by electromagnetic lenses, which are located in the wall of the column. The strength of the magnets is controlled by the current provided them, which is determined by the positions of the various dials of the console. A comparison of the lens systems of a light and electron microscope is shown in Figure 18.12. The condenser lenses of an electron microscope are placed between the electron source and the specimen, and they focus the electron beam on the specimen. The specimen is supported on a small, thin metal grid (3 mm diameter) that is inserted with tweezers into a grid holder, which in turn is inserted into the column of the microscope. Because the focal lengths of the lenses of an electron microscope vary depending on the current supplied to them, one objective lens provides the entire range of magnification delivered by the instrument. As in the light microscope, the image from the objective lens serves as the object for an additional lens system.
The image provided by the objective lens of an electron microscope is only magnified about 100 times, but


Figure 18.11 A comparison between the information contained in images taken by a light and electron microscope at a comparable magnification of 4500 times actual size. (a) A photo of skeletal muscle tissue that had been embedded in plastic, sectioned at 1 μm, and photographed with a light microscope under an oil immersion objective lens. (b) An adjacent section to that used for part a that was cut at 0.025 μm and examined under the electron microscope at comparable magnification to that in a. The resulting image displays a one- to


electron scattering and obtain required contrast, tissues are fixed and stained with solutions of heavy metals (described below). These metals penetrate into the structure of the cells and become selectively complexed with different parts of the organelles. Those parts of a cell that bind the greatest number of metal atoms allow passage of the least number of electrons. The fewer electrons that are focused on the screen at a given spot, the darker that spot. Photographs of the image are made by lifting the viewing screen out of the way and allowing the electrons to strike a photographic plate in position beneath the screen. Because photographic emulsions are directly sensitive to electrons, much as they are to light, an image of the specimen can be recorded directly on film. Alternatively, the image can be captured on a CCD video camera (page 738) used to monitor the field of view. Although a video camera provides an instant image without the need for chemical development, the image lacks the exceptionally high resolution that is available with film. The analogue image captured on film can be converted to a digital image by a process of digitization.
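The relationship between accelerating voltage and electron wavelength given earlier in this section (wavelength in angstroms equal to the square root of 150/V) is easy to evaluate numerically. The sketch below does so for several voltages and then estimates a limit of resolution. The resolution formula used, 0.61 times the wavelength divided by the numerical aperture, is an assumption about the form of the equation on page 734, which is not reproduced in this section. The function names are illustrative.

```python
import math


def electron_wavelength_angstrom(voltage_volts):
    """Electron wavelength in angstroms from the relation wavelength = sqrt(150 / V)."""
    return math.sqrt(150.0 / voltage_volts)


def resolution_limit(wavelength, numerical_aperture):
    """Limit of resolution, ASSUMING the form d = 0.61 * wavelength / NA."""
    return 0.61 * wavelength / numerical_aperture


for volts in (10_000, 60_000, 100_000):
    wavelength = electron_wavelength_angstrom(volts)
    print(f"{volts:>7} V -> wavelength {wavelength:.4f} angstroms")

# A numerical aperture of 0.01 is the upper end of the range quoted in the text.
wavelength_60kv = electron_wavelength_angstrom(60_000)
d_small_aperture = resolution_limit(wavelength_60kv, 0.01)
print(f"60 kV, NA 0.01 -> limit of about {d_small_aperture:.2f} angstroms")
```

Under these assumptions, the 0.05 Å wavelength at 60,000 V combined with a numerical aperture of 0.01 gives a limit of roughly 3 Å, which agrees with the practical range of 3 to 5 Å quoted for standard TEMs.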

[Figure 18.12 labels: lamp (light microscope) or filament (electron microscope), condenser lens, specimen, objective lens, intermediate image, ocular or projector lens, viewing screen with final image.]


Figure 18.12 A comparison of the lens systems of a light and electron microscope. (REPRINTED FROM ALAN W. AGAR, PRINCIPLES AND PRACTICE OF ELECTRON MICROSCOPE OPERATION, ELSEVIER/NORTH-HOLLAND, 1974. USED WITH PERMISSION FROM ELSEVIER LTD.)

unlike the light microscope, there is sufficient detail present in this image to magnify it an additional 10,000 times. By altering the current applied to the various lenses of the microscope, magnifications can vary from about 1000 times to 250,000 times. Electrons that have passed through the specimen are brought to focus on a phosphorescent screen situated at the bottom of the column. Electrons striking the screen excite a coating of fluorescent crystals, which emit their own visible light that is perceived by the eye as an image of the specimen. Image formation in the electron microscope depends on differential scattering of electrons by parts of the specimen. Consider a beam of electrons emitted by the filament and focused on the screen. If no specimen were present in the column, the screen would be evenly illuminated by the beam of electrons, producing an image that is uniformly bright. By contrast, if a specimen is placed in the path of the beam, some of the electrons strike atoms in the specimen and are scattered away. Electrons that bounce off the specimen cannot pass through the very small aperture at the back focal plane of the objective lens and are, therefore, lost as participants in the formation of the image. The scattering of electrons by a part of the specimen is proportional to the size of the nuclei of the atoms that make up the specimen. Because the insoluble material of cells consists of atoms of relatively low atomic number—carbon, oxygen, nitrogen, and hydrogen—biological material possesses very little intrinsic capability to scatter electrons. To increase

Specimen Preparation for Electron Microscopy As with the light microscope, tissues to be examined in the electron microscope must be fixed, embedded, and sectioned. Fixation of tissue for electron microscopy (Figure 18.13) is much more critical than for light microscopy because the sections are subjected to much greater scrutiny. A fixative must stop the life of the cell without significantly altering the structure of that cell. At the level of resolution of the electron microscope, relatively minor damage, such as swollen mitochondria or ruptured endoplasmic reticulum, becomes very apparent. To obtain the most rapid fixation and the least cellular damage, very small pieces of tissue (less than 1.0 mm³) are fixed and embedded. Fixatives are chemicals that denature and precipitate cellular macromolecules. Chemicals having such action may cause the coagulation or precipitation of materials that had no structure in the living cell, leading to the formation of an artifact. The best argument that a particular structure is not an artifact is the demonstration of its existence in cells fixed in a variety of different ways or, even better, not fixed at all. To view cells that have not been fixed, the tissue is rapidly frozen, and special techniques are utilized to reveal its structure (see cryofixation and freeze-fracture replication, described later). The most common fixatives for electron microscopy are glutaraldehyde and osmium tetroxide. Glutaraldehyde is a 5-carbon compound with an aldehyde group at each end of the molecule. The aldehyde groups react with amino groups in proteins and cross-link the proteins into an insoluble network. Osmium is a heavy metal that reacts primarily with fatty acids, leading to the preservation of cellular membranes. Once the tissue has been fixed, the water is removed by dehydration in alcohol, and the tissue spaces are filled with a material that supports tissue sectioning.
The demands of electron microscopy require the sections to be very thin. The wax sections cut for light microscopy are rarely thinner than about 5 μm, whereas sections for conventional electron microscopy

[Figure 18.13 labels: small piece of tissue (1 mm³) placed in fixative (e.g., glutaraldehyde); wash; tissue placed in a second fixative (e.g., OsO4); wash; dehydration in 70% ethanol, 95% ethanol, 100% ethanol, and propylene oxide; infiltration in a solution of plastic embedding medium (e.g., unpolymerized Epon); tissue embedded in plastic medium contained within a vial; plastic in vial polymerizes into a solid block with tissue at the bottom edge of the block; block containing tissue is trimmed to prepare for sectioning; in the ultramicrotome, tissue is sliced into sections approximately 100 nm thick as the block moves down across the sharp edge of a glass or diamond knife, and sections float in a trough of water just behind the knife edge; close-up view of sections in a ribbon floating in trough; sections are stained with a drop of heavy metal stain; EM grid containing sections is placed in a grid holder and examined in the electron microscope.]

Figure 18.13 Preparation of a specimen for observation in the electron microscope.

treated with metal-tagged antibodies or other materials that react with specific molecules in the tissue section. Studies with antibodies are usually carried out on tissues embedded in acrylic resins, which are more permeable to large molecules than epoxy resins. Cryofixation and the Use of Frozen Specimens Cells and tissues do not have to be fixed with chemicals and embedded in plastic resins in order to be observed with the electron microscope. Alternatively, cells and tissues can be rapidly frozen. Just as a chemical fixative stops metabolic processes and preserves biological structure, so too does rapid freezing, which is called cryofixation. Because cryofixation accomplishes these goals without altering the cell’s macromolecules, it is less likely to lead to the formation of artifacts. The major difficulty


are best when cut at less than 0.1 μm (equivalent in thickness to about four ribosomes). Tissues to be sectioned for electron microscopy are usually embedded in epoxy resins, such as Epon or Araldite. Sections are cut by slowly bringing the plastic block down across an extremely sharp cutting edge (Figure 18.13) made of cut glass or a finely polished diamond face. The sections coming off the knife edge float onto the surface of a trough of water that is contained just behind the knife edge. The sections are then picked up with the metal specimen grid and dried onto its surface. The tissue is stained by floating the grids on drops of heavy metal solutions, primarily uranyl acetate and lead citrate. These heavy metal atoms bind to macromolecules and provide the atomic density required to scatter the electron beam. In addition to the standard stains, tissue sections can be


with cryofixation is the formation of ice crystals, which grow outward from sites where nucleation occurs. As an ice crystal grows, it destroys the fragile contents of the cell in which it develops. The best way to avoid ice crystal formation is to freeze the specimen so rapidly that crystals don’t have time to grow. It is as if the water is frozen yet remains in its liquid state. Water in this state is said to be “vitrified.” There are several techniques employed to achieve such ultrarapid freezing rates. Smaller specimens are typically plunged into a liquid of very low temperature (such as liquid propane, boiling point of −42°C). Larger specimens are best treated by high-pressure freezing. In this technique, the specimen is subjected to high hydrostatic pressure and sprayed with jets of liquid nitrogen. High pressure lowers the freezing point of water, reducing the rate of ice crystal growth. You might not think that a frozen block of tissue would be of much use to a microscopist, but a surprising number of approaches can be taken to visualize frozen cellular structure in the light or electron microscope. For example, after suitable preparation, a frozen block of tissue can be sectioned with a special microtome in a manner similar to that of a paraffin or plastic block of tissue. Frozen sections (cryosections) can be prepared for examination under either a light or an electron microscope. Frozen sections are particularly useful for studies on enzymes, whose activities tend to be denatured by chemical fixatives. Because frozen sections can be prepared much more rapidly than paraffin or plastic sections, they are often employed by pathologists to examine the light microscopic structure of tissues removed during surgery. As a result, the determination as to whether or not a tumor is malignant can be made while the patient is still on the operating table. Frozen cells do not have to be sectioned to reveal internal structure.
Figure 1.11 shows an image of the thin, peripheral region of an intact cell that had been crawling over the surface of an electron microscope grid an instant before it was rapidly frozen. Unlike the standard electron micrograph, the image in Figure 1.11 has a three dimensionality, like the specimen itself, because it was generated by a computer rather than directly by a camera. To obtain the image, the computer aligns a large number of two-dimensional digital images of the cell that are captured as the specimen is tilted at defined angles relative to the axis of the electron beam. The three-dimensional, computational reconstruction is called a tomogram, and the technique is called cryoelectron tomography (cryo-ET).¹ Cryo-ET, which was developed by Wolfgang Baumeister of the Max-Planck Institute in Germany, has revolutionized the way in which nanosized intracellular structures can be studied in unfixed, fully hydrated, flash-frozen cells. Cryo-ET can also be utilized to examine the three-dimensional organization of structures present in vitro, as exemplified by the reconstruction of an isolated polysome in the act of translation that is shown in Figure 11.52b. Cryo-ET, which can deliver a resolution of a few nanometers, provides an important bridge between the

¹ This technique is similar in principle to computerized axial tomography (CAT scans), which uses a multitude of X-ray images taken at different angles to the body to generate a three-dimensional image. Fortunately, the machinery used in radiological tomography allows the X-ray source and detector to rotate while the patient can remain stationary.


Figure 18.14 Examples of negatively stained and metal-shadowed specimens. Electron micrographs of a tobacco rattle virus after negative staining with potassium phosphotungstate (a) or shadow casting with chromium (b). (COURTESY OF M. K. CORBETT.)

cellular and molecular worlds. Two other approaches that require the electron microscopic analysis of frozen specimens— freeze fracture replication and single particle analysis—are discussed on pages 745 and 759. Negative Staining The electron microscope is also well suited for examining very small particulate materials, such as viruses, ribosomes, multisubunit enzymes, cytoskeletal elements, and protein complexes. The shapes of individual proteins and nucleic acids can also be resolved as long as they are made to have sufficient contrast from their surroundings. One of the best ways to make such substances visible is to employ negative staining procedures in which heavy-metal deposits are collected everywhere on a specimen grid except where the particles are present. As a result, the specimen stands out by its relative brightness on the viewing screen. Examples of negatively stained specimens are shown in Figures 2.39a and 18.14a. Shadow Casting Another technique to visualize isolated particles is to have the objects cast shadows. The technique is described in Figure 18.15. The grids containing the specimens are placed in a sealed chamber, which is then evacuated by vacuum pump. The chamber contains a filament composed of

[Figure labels for Figures 18.15 and 18.16: evaporation of metal from platinum wire, evacuated chamber, specimen, specimen grid, liquid freon, liquid N2, specimen support, bell jar, cold knife.]

Figure 18.15 The procedure used for shadow casting as a means to provide contrast in the electron microscope. This procedure is often used to visualize small particles, such as the viruses shown in the previous figure. DNA and RNA molecules are often made visible by a modification of this procedure known as rotary shadowing in which the metal is evaporated at a very low angle while the specimen is rotated.

a heavy metal (usually platinum) together with carbon. The filament is heated to high temperature, causing it to evaporate and deposit a metallic coat over accessible surfaces within the chamber. As a result, the metal is deposited on those surfaces facing the filament, while the opposite surfaces of the particles and the grid space in their shadow remain uncoated. When the grid is viewed in the electron microscope, the areas in shadow appear bright on the viewing screen, whereas the metal-coated regions appear dark. This relationship is reversed on the photographic plate, which is a negative of the image. The convention for illustrating shadowed specimens is to print a negative image in which the particle appears illuminated by a bright, white light (corresponding to the coated surface) with a dark shadow cast by the particle (Figure 18.14b). The technique provides excellent contrast and produces a three-dimensional effect.

[Figure 18.16 labels: fracturing, etching, shadowing and replicating (metal layer, carbon layer), replica viewed in electron microscope.]

Figure 18.16 Procedure for the formation of freeze-fracture replicas as described in the text. Freeze etching is an optional step in which a thin layer of covering ice is evaporated to reveal additional information about the structure of the fractured specimen.


Freeze-Fracture Replication and Freeze Etching As noted above, a number of electron microscopic techniques have been adapted to work with frozen tissues. The ultrastructure of frozen cells is often viewed using the technique of freeze-fracture replication, which is illustrated in Figure 18.16. Small pieces of tissue are placed on a small metal disk and rapidly frozen. The disk is then mounted on a cooled stage within a vacuum chamber, and the frozen tissue block is struck by a knife edge. The resulting fracture plane spreads out from the point of contact, splitting the tissue into two pieces, not unlike the way that an axe blade splits a piece of wood in two. Consider what might happen as a fracture plane spreads through a cell containing a variety of organelles of different composition. These structures tend to cause deviations in the fracture plane, either upward or downward, giving the fracture face elevations, depressions, and ridges that reflect the contours


of the protoplasm traversed. Consequently, the surfaces exposed by the fracture contain information about the contents of the cell. The goal is to make this information visible. The replication process accomplishes this by using the fractured surface as a template on which a heavy-metal layer is deposited. The heavy metal is deposited onto the newly exposed surface of the frozen tissue in the same chamber where the tissue was fractured. The metal is deposited at an angle to provide shadows that accentuate local topography (Figure 18.17), as described in the previous section on shadow casting. A carbon layer is then deposited on top of the metal layer to cement the patches of metal into a solid surface. Once this cast has been made, the tissue that provided the template can be thawed, removed, and discarded; it is the metal–carbon replica that is placed on the specimen grid and viewed in the electron microscope. Variations in thickness of the metal in different parts of the replica cause variations in the numbers of penetrating electrons to reach the viewing screen, producing the necessary contrast in the image. As discussed in Chapter 4, fracture planes take the path of least resistance through the frozen block, which often carries them through the center of cellular membranes. As a result, this technique is particularly well suited for examining the distribution of integral membrane proteins as they span the lipid bilayer (as in Figures 4.15, 7.30, 7.31, and 7.32). Such studies carried out by Daniel Branton and others played an important role in the formulation of the fluid mosaic structure of cellular membranes in the early 1970s (page 124). Freeze-fracture replication by itself is an extremely valuable technique, but it can be made even more informative by including a step called freeze etching (Figure 18.16). 
In this step, the frozen, fractured specimen, while still in place within the cold chamber, is exposed to a vacuum at an elevated temperature for one to a few minutes, during which a layer of ice can evaporate (sublime) from the exposed surface. Once some of the ice has been removed, the surface of the structure can be coated with heavy metal and carbon to create a metallic replica that reveals both the external surface and internal structure of cellular membranes. The development by John Heuser of deep-etching techniques, in which greater amounts of surface ice are removed, led to a fascinating look at cellular organelles. Examples of specimens prepared by this technique are shown in Figures 18.18, 8.38, and 9.46, where it can be seen that the individual parts of the cell stand out in deep relief against the background. The technique delivers very high resolution and can be used to reveal the structure and the distribution of macromolecular complexes, such as those of the cytoskeleton, as they are presumed to exist within the living cell.

18.3 | Scanning Electron and Atomic Force Microscopy The TEM has been exploited most widely in the examination of the internal structure of cells. In contrast, the scanning electron microscope (SEM) is utilized primarily to examine the surfaces of objects ranging in size from a virus to an animal


Figure 18.17 Replica of a freeze-fractured onion root cell showing the nuclear envelope (N.E.) with its pores (N.P.), the Golgi complex (G), a cytoplasmic vacuole (V), and the cell wall (C.W.). (COURTESY OF DANIEL BRANTON, HARVARD UNIVERSITY.)

Figure 18.18 Deep etching. Electron micrograph of ciliary axonemes from the protozoan Tetrahymena. The axonemes were fixed, frozen, and fractured, and the frozen water at the surface of the fractured block was evaporated away, leaving a portion of the axonemes standing out in relief, as visualized in this metal replica. The arrow indicates a distinct row of outer dynein arms. (FROM URSULA W. GOODENOUGH AND JOHN E. HEUSER, J. CELL BIOL. 95:800, 1982, FIG. 3, RIGHT PHOTO. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)


Figure 18.19 Scanning electron microscopy. Scanning electron micrographs of (a) a T4 bacteriophage (×275,000) and (b) the head of an insect (×40). (A: FROM A. N. BROERS, B. J. PANESSA, AND J. F. GENNARO, SCIENCE 189:635, 1975; © 1975. REPRINTED WITH PERMISSION FROM AAAS; B: COURTESY OF H. F. HOWDEN AND L. E. C. LING.)

by electrons that are reflected back from the specimen (backscattered) or by secondary electrons given off by the specimen after being struck by the primary electron beam. These electrons strike a detector that is located near the surface of the specimen. Image formation in the SEM is indirect. In addition to the beam that scans the surface of the specimen, another electron beam synchronously scans the face of a cathode-ray tube, producing an image similar to that seen on a television screen. The electrons that bounce off the specimen and reach the detector control the strength of the beam in the cathode-ray tube. The more electrons collected from the specimen at a given spot, the stronger the signal to the tube and the greater the intensity of the beam on the screen at the corresponding spot. The result is an image on the screen that reflects the surface topology of the specimen because it is this topology (the crevices, hills, and pits) that determines the number of electrons collected from the various parts of the surface. As evident in the micrographs of Figure 18.19, an SEM can provide a great range of magnification (from about 15 to 150,000 times for a standard instrument). Resolving power of an SEM is related to the diameter of the electron beam. Newer models are capable of delivering resolutions of less than 5 nm, which can be used to localize gold-labeled antibodies bound to a cell’s surface. The SEM also provides remarkable depth of focus, approximately 500 times that of the light microscope at a corresponding magnification. This property gives SEM images their three-dimensional quality. At the


head (Figure 18.19). The construction and operation of the SEM are very different from that of the TEM. The goal of specimen preparation for the SEM is to produce an object that has the same shape and surface properties as the living state, but is devoid of fluid, as required for observing the specimen under vacuum. Because water constitutes such a high percentage of the weight of living cells and is present in association with virtually every macromolecule, its removal can have a very destructive effect on cell structure. When cells are simply air dried, destruction results largely from surface tension at air–water interfaces. Specimens to be examined in the SEM are fixed, passed through a series of alcohols, and then dried by a process of critical-point drying. Critical-point drying takes advantage of the fact that a critical temperature and pressure exist for each solvent at which the density of the vapor is equal to the density of the liquid. At this point, there is no surface tension between the gas and the liquid. The solvent of the cells is replaced with a liquid transitional fluid (generally carbon dioxide), which is vaporized under pressure so that the cells are not exposed to any surface tension that might distort their three-dimensional configuration. Once the specimen is dried, it is coated with a thin layer of metal, which makes it suitable as a target for an electron beam. In the TEM, the electron beam is focused by the condenser lenses to simultaneously illuminate the entire viewing field. In the SEM, electrons are accelerated as a fine beam that scans the specimen. In the TEM, electrons pass through the specimen to form the image. In the SEM, the image is formed


cellular level, the SEM allows the observer to appreciate the structure of the outer cell surface and all of the various processes, extensions, and extracellular materials that interact with the environment.

Atomic Force Microscopy Although it is not an electron microscope, the atomic force microscope (AFM) is a high-resolution scanning instrument that is becoming increasingly important in nanotechnology and molecular biology. Several AFM-derived images can be found in the text, including Figures 4.24a, 5.25, 7.32c, 9.7, and 9.52. The AFM operates by scanning a sharp, microsized tip (probe) over the surface of the specimen. In one type of AFM, the probe is attached to a tiny oscillating beam (or cantilever), whose frequency of oscillations changes as the tip encounters variations in the topography of the specimen. These changes in oscillation of the beam can be converted into a three-dimensional topographic image of the surface of the specimen. Unlike other techniques for the determination of molecular structure, such as X-ray crystallography and cryoEM, which average the structure of many individual molecules, AFM provides an image of each individual molecule as it is oriented in the field (see Figure 5.25). Another limitation of electron microscopic and X-ray crystallographic technologies is that they are only able to provide static images or “snapshots.” The development of high-speed AFM (HS-AFM),


Figure 18.20 High-speed atomic force microscopy. The basic elements of an HS-AFM are shown in this schematic drawing. The sample is mounted on a component (the piezo actuator) that is attached to the microscope stage (not shown). Signals from the actuator cause the cantilever to oscillate up and down so that the AFM tip intermittently contacts the sample as it scans over the sample’s surface. Forces that develop between the sample and AFM tip cause deflections of the tip, which are detected by a laser beam that is reflected off of the back of the AFM tip. Movements in the position of the laser beam, which reflect topographical differences along the sample surface, are recognized by a position detector and this information is used to construct an image of the specimen. The AFM-based movie of the movement of a myosin V molecule shown in Figure 9.52 was “filmed” at approximately 7 frames per second. (C. VEIGEL AND C. P. SCHMIDT, NATURE REVS. MOL. CELL BIOL. 12:166, 2011; COPYRIGHT 2011. NATURE REVIEWS MOLECULAR CELL BIOLOGY BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)

which is shown schematically in Figure 18.20, now allows researchers to obtain rapid sequential images of a macromolecule so that its activity can be followed over real time. This technique is elegantly illustrated in the micrographs of Figure 9.52, which depict the steps taken by a single myosin V molecule as it moves along an actin filament. The opportunity to watch a “movie” of an individual protein as it carries on its normal activities is a remarkable achievement in a field that has been studying the structure of these molecules for more than half a century. The probe of an AFM can be used as more than a monitoring device; it can also be employed as a “nanomanipulator” to push or pull on the specimen in an attempt to measure various mechanical properties. This capability is illustrated in Figure 9.7 where it is shown that a single intermediate filament can be stretched to several times its normal length. In a different protocol, the AFM tip can be coated with ligands for a particular receptor, and measurements can be made of the affinity of that receptor for the ligand in question. Because of its many potential uses, the AFM has been referred to as “a lab on a tip.”

18.4 | The Use of Radioisotopes A tracer is a substance that reveals its presence in one way or another and thus can be localized or monitored during the course of an experiment. Depending on the substance and the type of experiment, a tracer might be fluorescently labeled, spin labeled, density labeled, or radioactively labeled. In each case, a labeled group enables a molecule to be detected without affecting the specificity of its interactions. Radioactive molecules, for example, participate in the same reactions as nonradioactive species, but their location can be followed and the amount present can be measured. The identity of an atom (whether it be an iron atom, a chlorine atom, or some other type), and thus its chemical properties, is determined by the number of positively charged protons in its nucleus. All hydrogen atoms have a single proton, all helium atoms have two protons, all lithium atoms have three protons, and so forth. Not all hydrogen, helium, or lithium atoms, however, have the same number of neutrons. Atoms having the same number of protons and different numbers of neutrons are said to be isotopes of one another. Even hydrogen, the simplest element, can exist as three different isotopes, depending on whether the atom has 0, 1, or 2 neutrons in its nucleus. Of these three isotopes of hydrogen, only the one containing two neutrons is radioactive; it is tritium (³H). Isotopes are radioactive when they contain an unstable combination of protons and neutrons. Atoms that are unstable tend to break apart, or disintegrate, thus achieving a more stable configuration. When an atom disintegrates, it releases particles or electromagnetic radiation that can be monitored by appropriate instruments. Radioactive isotopes occur throughout the periodic table of elements, and they can be produced from nonradioactive elements in the laboratory. Many biological


² A Curie is that amount of radioactivity required to yield 3.7 × 10¹⁰ disintegrations per second.

Table 18.1 Properties of a Variety of Radioisotopes Used in Biological Research

Symbol and atomic weight    Half-life    Type of particle(s) emitted
³H                          12.3 yr      Beta
¹¹C                         20 min       Beta
¹⁴C                         5700 yr      Beta
²⁴Na                        15.1 hr      Beta, Gamma
³²P                         14.3 d       Beta
³⁵S                         87.1 d       Beta
⁴²K                         12.4 hr      Beta, Gamma
⁴⁵Ca                        152 d        Beta
⁵⁹Fe                        45 d         Beta, Gamma
⁶⁰Co                        5.3 yr       Beta, Gamma
⁶⁴Cu                        12.8 hr      Beta, Gamma
⁶⁵Zn                        250 d        Beta, Gamma
¹³¹I                        8.0 d        Beta, Gamma
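The half-lives in Table 18.1 can be turned into a quick estimate of how much radioactivity remains after a given time. The following is a minimal sketch, assuming simple exponential decay; the function names and the 365.25-day year used for unit conversion are illustrative assumptions rather than anything specified in the text.

```python
# Sketch: exponential decay of a radioisotope using half-lives from Table 18.1.
# Function names and the year-to-day conversion are illustrative assumptions.

CURIE_DPS = 3.7e10  # disintegrations per second in one Curie (footnote 2)

HALF_LIFE_DAYS = {            # half-lives from Table 18.1, expressed in days
    "3H": 12.3 * 365.25,      # assumes a 365.25-day year
    "32P": 14.3,
    "35S": 87.1,
    "131I": 8.0,
}


def fraction_remaining(isotope, elapsed_days):
    """Fraction of the original radioactive atoms left after elapsed_days."""
    half_life = HALF_LIFE_DAYS[isotope]
    return 0.5 ** (elapsed_days / half_life)


def activity_dps(initial_curies, isotope, elapsed_days):
    """Remaining activity in disintegrations per second."""
    return initial_curies * CURIE_DPS * fraction_remaining(isotope, elapsed_days)


if __name__ == "__main__":
    # One half-life of 32P has elapsed, so the fraction is 0.5.
    print(fraction_remaining("32P", 14.3))
    # One Curie of tritium after one half-life (about 12.3 years).
    print(f"{activity_dps(1.0, '3H', 12.3 * 365.25):.3e} disintegrations/s")
    # A 32P-labeled sample stored for 30 days.
    print(f"{fraction_remaining('32P', 30):.3f} of the 32P remains")
```

A calculation of this kind shows why short-lived isotopes such as ³²P must be used soon after purchase, whereas a tritium-labeled compound loses little activity over the course of a typical experiment.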

activate the emulsion that coats a piece of film. If the photographic emulsion is brought into close contact with a radioactive source, the particles emitted by the source leave tiny, black silver grains in the emulsion after photographic development. Autoradiography is used to localize radioisotopes within tissue sections that have been immobilized on a slide or TEM grid. The steps involved in the preparation of a light microscopic autoradiograph are shown in Figure 18.21. The emulsion is applied to the sections on the slide or grid as a very thin overlying layer, and the specimen is put into a lightproof container to allow the emulsion to be exposed by the emissions. The longer the specimen is left before development, the greater the number of silver grains that are formed. When the developed slide or grid is examined in the microscope, the location of the silver grains in the layer of emulsion just above the tissue indicates the location of the radioactivity in the underlying cells. An electron microscopic micrograph is shown in Figure 8.3a.

18.5 | Cell Culture Throughout this book we have stressed the approach in cell biology that attempts to understand particular processes by their analysis in a simplified, controlled, in vitro system. The same approach can be applied to the study of cells themselves because they too can be removed from the influences they are normally subject to within a complex multicellular organism. Learning to grow cells outside the organism, that is, in cell culture, has proved to be one of the most valuable technical achievements in the entire study of biology. A quick glance through any journal in cell biology reveals that the majority of the articles describe research carried out on cultured cells. The reasons for this are numerous: cultured cells can be obtained in large quantity; most cultures contain only a single type of cell; many different cellular activities, including endocytosis, cell


molecules can be purchased in a radioactive state, that is, containing one or more radioactive atoms as part of their structure. The half-life (t₁/₂) of a radioisotope is a measure of its instability. The more unstable a particular isotope, the greater the likelihood that a given atom will disintegrate in a given amount of time. If one starts with one Curie² of tritium, half that amount of radioactive material will be left after approximately 12 years (which is the half-life of this radioisotope). In the early years of research into photosynthesis and other metabolic pathways, the only available radioisotope of carbon was ¹¹C, which has a half-life of approximately 20 minutes. Experiments with ¹¹C were literally carried out on the run so that the amount of incorporated isotope could be measured before the substance disappeared. The availability in the 1950s of ¹⁴C, a radioisotope having a half-life of 5700 years, was greeted with great celebration. The radioisotopes of greatest importance in cell biological research are listed in Table 18.1, along with information on their half-lives and the nature of their radiation. Three main forms of radiation can be released by atoms during their disintegration. The atom may release an alpha particle, which consists of two protons and two neutrons and is equivalent to the nucleus of a helium atom; a beta particle, which is equivalent to an electron; and/or gamma radiation, which consists of electromagnetic radiation or photons. The most commonly used isotopes are beta emitters, which are monitored by either of two different methodologies: liquid scintillation spectrometry or autoradiography. Liquid scintillation spectrometry is used to measure the amount of radioactivity in a given sample. The technique is based on the property of certain molecules, termed phosphors or scintillants, to absorb some of the energy of an emitted particle and release that energy in the form of light. 
In preparing a sample for liquid scintillation counting, one mixes the sample with a solution of the phosphor in a glass or plastic scintillation vial. This brings the phosphor and radioactive isotope into very close contact so that radiation from even the weakest beta emitters can be efficiently measured. Once mixed, the vial is placed into the counting instrument, where it is lowered into a well whose walls contain an extremely sensitive photodetector. As radioactive atoms within the vial disintegrate, the emitted particles activate the scintillants, which emit flashes of light. The light is detected by a photocell, and the signal is amplified by a photomultiplier tube within the counter. After correcting for background noise, the amount of radioactivity present in each vial is displayed on a printout. Autoradiography is a broad-based technique used to determine where a particular isotope is located, whether in a cell, in a polyacrylamide gel, or on a nitrocellulose filter. The importance of autoradiography in early discoveries on the synthetic activities of cells was described in the pulse-chase experiments on page 273. Autoradiography takes advantage of the ability of a particle emitted from a radioactive atom to activate a photographic emulsion, much like light or X-rays

Figure 18.21 Steps taken during the preparation of a light microscopic autoradiograph. Steps labeled in the figure: incubate cells in radioactive compound; wash and fix cells; dehydrate and embed cells in wax or plastic; section wax or plastic block; dip slides into radiation-sensitive emulsion in the darkroom; store slides in dark box until ready to process; develop. The slide after processing is shown in top and side views, with silver grains in the layer of emulsion over the cells of the section, along with background silver grains.

movement, cell division, membrane trafficking, and macromolecular synthesis, can be studied in cell culture; cells can differentiate in culture; and cultured cells respond to treatment with drugs, hormones, growth factors, and other active substances. The early tissue culturists employed media containing a great variety of unknown substances. Cell growth was accomplished by adding fluids obtained from living systems, such as lymph, blood serum, or embryo homogenates. It was found that cells required a considerable variety of nutrients, hormones, growth factors, and cofactors to remain healthy and proliferate. Even today, most culture media contain large amounts of serum. The importance of serum (or the growth factors it contains) for the proliferation of cultured cells is shown in the cell-growth curves in Figure 16.4. One of the primary goals of cell culturists has been to develop defined, serum-free media that support the growth of cells. Using a pragmatic approach in which combinations of various ingredients are tested for their ability to support cell growth and proliferation, a growing number of cell types have been successfully cultured in “artificial” media that lack serum or other natural fluids. As would be expected, the composition of these chemically defined media is relatively complex; they consist of a mixture of nutrients and vitamins, together with a variety of purified proteins, including insulin, epidermal

growth factor, and transferrin (which provides the cells with iron). Because they are so rich in nutrients, tissue culture media are a very inviting habitat for the growth of microorganisms. To prevent bacteria from contaminating cell cultures, tissue culturists must go to great lengths to maintain sterile conditions within their working space. This is accomplished by using sterile gloves, sterilizing all supplies and instruments, employing low levels of antibiotics in the media, and conducting activities within a sterile hood. The first step in cell culture is to obtain the cells. In most cases, one need only remove a vial of frozen, previously cultured cells from a tank of liquid nitrogen, thaw the vial, and transfer the cells to the waiting medium. A culture of this type is referred to as a secondary culture because the cells are derived from a previous culture. In a primary culture, on the other hand, the cells are obtained directly from the organism. Most primary cultures of animal cells are obtained from embryos, whose tissues are more readily dissociated into single cells than those of adults. Dissociation is accomplished with the aid of a proteolytic enzyme, such as trypsin, which digests the extracellular domains of proteins that mediate cell adhesion (Chapter 7). The tissue is then washed free of the enzyme and usually suspended in a saline solution that lacks Ca²⁺ ions and contains a substance, such as ethylenediamine tetraacetate


nonflattened, spindle-shaped morphology. The extracellular fibronectin matrix appears in blue. Bar equals 10 μm. (FROM EDNA CUKIERMAN, CELL 130:603, 2007, FIG. 1, REPRINTED WITH PERMISSION FROM ELSEVIER. COURTESY OF KENNETH M. YAMADA, NATIONAL INSTITUTE OF DENTAL AND CRANIOFACIAL RESEARCH, NIH.)

(EDTA), that binds (chelates) calcium ions. As discussed in Chapter 7, calcium ions play a key role in cell–cell adhesion, and their removal from tissues greatly facilitates the separation of cells. Normal (nonmalignant) cells can divide a limited number of times (typically 50 to 100) before they undergo senescence and death (page 508). Because of this, many of the cells that are commonly used in tissue culture studies have undergone genetic modifications that allow them to be grown indefinitely. Cells of this type are referred to as a cell line, and they typically grow into malignant tumors when injected into susceptible laboratory animals. The frequency with which a normal cell growing in culture spontaneously transforms into a cell line depends on the organism from which it was derived. Mouse cells, for example, transform at a relatively high frequency; human cells transform only rarely, if ever. Human cell lines (e.g., HeLa cells) are typically derived from human tumors or from cells treated with cancer-causing viruses or chemicals. A number of laboratories are moving away from traditional two-dimensional culture systems, where cells are grown on the flat surface of a culture dish, to three-dimensional cultures, in which cells are grown in a three-dimensional matrix consisting of synthetic and/or natural extracellular materials. These materials can be purchased as products containing proteins and other components derived from natural basement membranes (Figures 7.4 and 7.8). Cells within the body do not live on flat, hard-plastic surfaces, and, consequently, it is thought that three-dimensional matrices provide a much

more natural environment for cultured cells. As a result, the morphology and behavior of cells in these three-dimensional environments are more like those of cells observed within the tissues of the body. This is reflected in differences in the cytoskeletal organization, types of cell adhesions, signaling activities, and states of differentiation of cells in the two types of culture systems. In addition, 3D culture systems are better suited to study cell–cell interactions because they allow cells to come into contact with one another at any point along their surface. Figure 18.22 shows images of human fibroblasts grown on either a flat surface or within a three-dimensional matrix. The cell in Figure 18.22a has pressed itself against the substratum, creating a highly unnatural upper (dorsal) and lower (ventral) surface with unusually broad and flattened lamellipodia. In contrast, the same type of cell grown in 3D culture (Figure 18.22b) has a typical spindle-shaped structure without any obvious surface differentiation. Many different types of plant cells can also grow in culture. In one approach, plant cells are treated with the enzyme cellulase, which digests away the surrounding cell wall, releasing the naked cell, or protoplast. Protoplasts can then be grown in a chemically defined medium that promotes their growth and division. Under suitable conditions, the cells can grow into an undifferentiated clump of cells, called a callus, which can be induced to develop shoots from which a plant can regenerate. In an alternate approach, the cells in leaf tissue can be induced by hormone treatment to lose their differentiated properties and develop into callus material. The callus can then be transferred to liquid media to start a cell culture.


Figure 18.22 A comparison of the morphology of cells growing in 2D versus 3D cultures. (a) A human fibroblast growing on a flat, fibronectin-coated substratum in 2D culture assumes a highly flattened shape with broad lamellipodia. Integrin-containing adhesions appear white. (b) The same type of cell growing in a 3D matrix assumes a


18.6 | The Fractionation of a Cell’s Contents by Differential Centrifugation Most cells contain a variety of different organelles. If one sets out to study a particular function of mitochondria or to isolate a particular enzyme from the Golgi complex, it is useful to first isolate the relevant organelle in a purified state. Isolation of a particular organelle in bulk quantity is generally accomplished by the technique of differential centrifugation, which depends on the principle that, as long as they are more dense than the surrounding medium, particles of different size and shape travel toward the bottom of a centrifuge tube at different rates when placed in a centrifugal field. To carry out this technique, cells are first broken open by mechanical disruption, typically using a mechanical homogenizer. Cells are homogenized in an isotonic buffered solution (often containing sucrose), which prevents the rupture of membrane vesicles due to osmosis. The homogenate is then subjected to a series of sequential centrifugations at increasing centrifugal forces. The steps in this technique were discussed in Chapter 8 and illustrated in Figure 8.5. Initially, the homogenate is subjected to low centrifugal forces for a short period of time so that only the largest cellular organelles, such as nuclei (and any remaining whole cells), are sedimented into a pellet. At greater centrifugal forces, relatively large cytoplasmic organelles (mitochondria, chloroplasts, lysosomes, and peroxisomes) can be spun out of suspension (Figure 8.5). In subsequent steps, microsomes (the fragments of vacuolar and reticular membranes of the cytosol) and ribosomes are removed from suspension. This last step requires the ultracentrifuge, which can generate speeds of 75,000 revolutions per minute, producing forces equivalent to 500,000 times that of gravity. 
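The relationship between rotor speed and centrifugal force quoted above can be checked with a short calculation. The sketch below assumes the standard expression for relative centrifugal force, RCF = ω²r/g, where ω is the angular velocity and r is the distance from the axis of rotation. The 8-cm radius is a hypothetical value chosen for illustration; the text does not specify a rotor.

```python
import math

# Sketch: relative centrifugal force (multiples of g) from rotor speed.
# The rotor radius used below is a hypothetical value, not from the text.

STANDARD_GRAVITY = 9.80665  # m/s^2


def relative_centrifugal_force(rpm, radius_cm):
    """Return the centrifugal force as a multiple of gravity.

    rpm       -- rotor speed in revolutions per minute
    radius_cm -- distance of the sample from the axis of rotation, in cm
    """
    omega = 2.0 * math.pi * rpm / 60.0            # angular velocity, rad/s
    acceleration = omega ** 2 * (radius_cm / 100.0)  # m/s^2
    return acceleration / STANDARD_GRAVITY


if __name__ == "__main__":
    for rpm in (1_000, 20_000, 75_000):
        rcf = relative_centrifugal_force(rpm, radius_cm=8.0)
        print(f"{rpm:>6} rpm at 8 cm -> about {rcf:,.0f} x g")
```

With these assumptions, 75,000 rpm at a radius of 8 cm corresponds to roughly 500,000 times the force of gravity, in line with the figure given for the ultracentrifuge. Because the force depends on the square of the speed, doubling the rotor speed quadruples the force.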
Once ribosomes have been removed, the supernatant consists of the cell’s soluble phase and those particles too small to be removed by sedimentation. The initial steps of differential centrifugation do not yield pure preparations of a particular organelle, so that further steps are usually required. In many cases, further purification

Figure 18.23 Purification of subcellular fractions by density-gradient equilibrium centrifugation. In this particular example, the medium is composed of a continuous sucrose-density gradient, and the different organelles sediment until they reach a place in the tube equal to their own density, where they form bands. The 20,000g pellet is obtained as shown in Figure 8.5. Labels in the figure: resuspended material from the 20,000g pellet is applied to a sucrose density gradient (1 molar to 2 molar sucrose) and centrifuged at 65,000 g for 2 hr; the resulting bands are lysosomes filled with Triton WR1339 (1.12 gram/ml), mitochondria (1.18 gram/ml), and peroxisomes (1.23 gram/ml).

is accomplished by centrifugation of one of the fractions through a density gradient, as shown in Figure 18.23, which distributes the contents of the sample into various layers according to their density. The composition of various fractions can be determined by microscopic examination, or by measuring the amounts of particular proteins known to be specific for particular organelles. Cellular organelles isolated by differential centrifugation retain a remarkably high level of normal activity, as long as they are not exposed to denaturing conditions during their isolation. Organelles isolated by this procedure can be used in cell-free systems to study a wide variety of cellular activities, including the synthesis of membrane-bound proteins (page 276), the formation of transport vesicles (page 293), DNA synthesis (page 559), and the transport of solutes and development of ionic gradients (see Figure 5.23).

18.7 | Isolation, Purification, and Fractionation of Proteins During the course of this book, we have considered the properties of many different proteins. Before information about the structure or function of a particular protein can be obtained, that protein must be isolated in a relatively pure state. Because most cells contain thousands of different proteins, the purification of a single species can be a challenging mission, particularly if the protein is present in the cell at low concentration. In this section, we will briefly survey a few of the techniques used to purify proteins. Purification of a protein is generally performed by the stepwise removal of contaminants. Two proteins may be very similar in one property, such as overall charge, and very different in another property, such as molecular size or shape. Consequently, the complete purification of a given protein usually requires the use of successive techniques that take advantage of different properties of the proteins being separated. Purification is measured as an increase in specific activity, which is the ratio of the amount of that protein to the total amount of protein present in the sample. Some identifiable feature of the specific protein must be utilized as an assay to determine the relative amount of that protein in the sample. If the protein is an enzyme, its catalytic activity may be used as an assay to monitor purification. Alternatively, assays can be based on immunologic, electrophoretic, electron microscopic, or other criteria. Measurements of total protein in a sample can be made using various properties, including total nitrogen, which can be very accurately measured and is quite constant at about 16 percent of the dry weight for all proteins.
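Progress through a purification is commonly summarized in a table of specific activity, fold purification, and yield at each step. The following is a minimal sketch of that bookkeeping for an enzyme assayed by its catalytic activity. The step names and all of the numbers are hypothetical values invented for illustration, and the class and function names are assumptions.

```python
from dataclasses import dataclass

# Sketch: tracking purification by specific activity.
# All step names and numbers below are hypothetical illustration values.


@dataclass
class Step:
    name: str
    total_protein_mg: float      # total protein recovered at this step
    total_activity_units: float  # total enzyme activity recovered


def specific_activity(step):
    """Activity per mg of total protein (units/mg)."""
    return step.total_activity_units / step.total_protein_mg


def purification_table(steps):
    """Return (name, specific activity, fold purification, % yield) per step.

    Fold purification and yield are relative to the first (crude) step.
    """
    crude = steps[0]
    rows = []
    for step in steps:
        sa = specific_activity(step)
        fold = sa / specific_activity(crude)
        yield_pct = 100.0 * step.total_activity_units / crude.total_activity_units
        rows.append((step.name, sa, fold, yield_pct))
    return rows


if __name__ == "__main__":
    steps = [
        Step("Crude extract", 1000.0, 10_000.0),
        Step("Ammonium sulfate precipitation", 250.0, 8_000.0),
        Step("Ion-exchange chromatography", 20.0, 6_000.0),
    ]
    for name, sa, fold, yield_pct in purification_table(steps):
        print(f"{name:<32} {sa:8.1f} units/mg  {fold:6.1f}-fold  {yield_pct:5.1f}% yield")
```

In this invented example the specific activity rises at each step even though some total activity is lost, which is the usual trade-off between purity and yield.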

Selective Precipitation The first step in purification should be one that can be carried out on a highly impure preparation and yield a large increase in specific activity. The first step usually takes advantage of solubility differences among proteins by selectively precipitating


Liquid Column Chromatography

Ion-Exchange Chromatography Proteins are large, polyvalent electrolytes, and it is unlikely that many proteins in a partially purified preparation have the same overall charge.

³ Liquid chromatography is distinguished from gas chromatography, in which the mobile phase is represented by an inert gas.

Figure 18.24 Ion-exchange chromatography. The separation of two proteins by DEAE-cellulose. In this case, a positively charged ion-exchange resin is used to bind the negatively charged protein. (Labels in the figure: mixture of + and – charged proteins; beads of + charged DEAE-cellulose; plot of amount of protein versus fraction number.)


Chromatography is a term for a variety of techniques in which a mixture of dissolved components is fractionated as it moves through some type of porous matrix. In liquid chromatographic techniques, components in a mixture can become associated with one of two alternative phases: a mobile phase, consisting of a moving solvent, and an immobile phase, consisting of the matrix through which the solvent is moving.³ In the chromatographic procedures described below, the immobile phase consists of materials that are packed into a column. The proteins to be fractionated are dissolved in a solvent and then passed through the column. The materials that make up the immobile phase contain sites to which the proteins in solution can bind. As individual protein molecules interact with the materials of the matrix, their progress through the column is retarded. Thus the greater the affinity of a particular protein for the matrix material, the slower its passage through the column. Because different proteins in the mixture have different affinity for the matrix, they are retarded to different degrees. As the solvent passes through the column and drips out the bottom, it is collected as fractions in a series of tubes. Those proteins in the mixture with the least affinity for the column appear in the first fractions to emerge from the column. The resolution of many chromatographic procedures has been improved with the development of high performance liquid chromatography (HPLC), in which long, narrow columns are used, and the mobile phase is forced under high pressure through a tightly packed noncompressible matrix composed of exceptionally small (e.g., 5 μm diameter) particles.

The overall charge of a protein is a summation of all the individual charges of its component amino acids. Because the charge of each amino acid depends on the pH of the medium (see Figure 2.27), the charge of each protein also depends on the pH. As the pH is lowered, negatively charged groups become neutralized and positively charged groups become more numerous. The opposite occurs as the pH is increased. A pH exists for each protein at which the total number of negative charges equals the total number of positive charges. This pH is the isoelectric point, at which the protein is neutral. The isoelectric point of most proteins is below pH 7. Ionic charge is used as a basis for purification in a variety of techniques, including ion-exchange chromatography. Ion-exchange chromatography depends on the ionic bonding of proteins to an inert matrix material, such as cellulose, containing covalently linked charged groups. Two of the most commonly employed ion-exchange resins are diethylaminoethyl (DEAE) cellulose and carboxymethyl (CM) cellulose. DEAE-cellulose is positively charged and therefore binds negatively charged molecules; it is an anion exchanger. CM-cellulose is negatively charged and acts as a cation exchanger. The resin is packed into a column, and the protein solution is allowed to percolate through the column in a buffer whose composition promotes the binding of some or all of the proteins to the resin. Proteins are bound to the resin reversibly and can be displaced by increasing the ionic strength of the buffer (which adds small ions to compete with the charged groups of the macromolecules for sites on the resin) and/or changing its pH. Proteins are eluted from the column in order from the least strongly bound to the most strongly bound. Figure 18.24 shows a schematic representation of the separation of two protein species by stepwise elution from an ion-exchange column.
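The dependence of net charge on pH described above can be made concrete with a rough calculation. The following Python sketch estimates the net charge of a polypeptide from its sequence using the Henderson-Hasselbalch relationship, locates the isoelectric point by bisection, and reports which of the two resins named above would be expected to bind the protein at a given pH. The pKa values, the function names, and the two short example sequences are illustrative assumptions that do not come from the text; pKa values in a folded protein can differ considerably from these approximate free-group values, so the result is only an estimate.

```python
# Approximate pKa values for ionizable groups (assumed, textbook-style values;
# real values shift with the local environment in a folded protein).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}              # protonated form carries +1
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}     # deprotonated form carries -1
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 2.0


def net_charge(sequence, ph):
    """Estimate the net charge of a polypeptide (one-letter codes) at a given pH."""
    # Fraction of a basic group that is protonated: 1 / (1 + 10^(pH - pKa))
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    # Fraction of an acidic group that is deprotonated: 1 / (1 + 10^(pKa - pH))
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH at which the estimated net charge is zero, by bisection.

    Net charge falls steadily as pH rises, so a single crossing exists
    between pH 0 (net positive) and pH 14 (net negative).
    """
    low, high = 0.0, 14.0
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


def suggested_exchanger(sequence, ph):
    """Name the resin expected to bind the protein at this pH."""
    charge = net_charge(sequence, ph)
    if charge < 0:
        return "DEAE-cellulose (anion exchanger)"
    if charge > 0:
        return "CM-cellulose (cation exchanger)"
    return "neither (protein is at its isoelectric point)"


if __name__ == "__main__":
    # Hypothetical short sequences, chosen only to give one acidic and one basic case.
    for name, seq in [("acidic example", "DDEEK"), ("basic example", "KKRRD")]:
        print(name, round(isoelectric_point(seq), 2), suggested_exchanger(seq, 7.0))
```

Under these assumptions, a protein whose isoelectric point lies below the buffer pH carries a net negative charge and would be expected to bind DEAE-cellulose, whereas one whose isoelectric point lies above the buffer pH would be expected to bind CM-cellulose.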


the desired protein. The solubility properties of a protein are determined largely by the distribution of hydrophilic and hydrophobic side chains on its surface. A protein’s solubility in a given solution depends on the relative balance between protein–solvent interactions, which keep it in solution, and protein–protein interactions, which cause it to aggregate and precipitate from solution. The salt most commonly employed for selective protein precipitation is ammonium sulfate, which is extremely soluble in water and has high ionic strength. Purification is achieved by gradually adding a solution of saturated ammonium sulfate to the crude protein extract. As addition of salt continues, precipitation of contaminating proteins increases, and the precipitate can be discarded. Ultimately, a point is reached at which the protein being sought comes out of solution. This point is recognized by the loss of activity in the soluble fraction when tested by the particular assay being used. Once the desired protein is precipitated, contaminating proteins are left behind in solution, while the protein being sought can be redissolved.

[Figure 18.25 labels: mixture of 3 proteins; porous beads. Figure 18.26a labels: protein to be purified (e.g., insulin receptor); ligand (e.g., insulin); agarose bead.]

Figure 18.25 Gel filtration chromatography. The separation of three globular proteins having different molecular mass, as described in the text. Among proteins of similar basic shape, larger molecules are eluted before smaller molecules.
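The size rule illustrated in Figure 18.25 can be expressed as a few lines of Python. The sketch below is an idealization under stated assumptions: all proteins are taken to be globular and of similar shape, the beads are treated as having a sharp exclusion limit, and the function name and return format are hypothetical. It uses the numbers from the worked example in the text (proteins of 250, 125, and 75 kDa on beads that admit globular proteins smaller than about 200 kDa).

```python
def elution_order(masses_kda, exclusion_limit_kda):
    """Predict the order in which globular proteins leave a gel filtration column.

    Proteins at or above the exclusion limit cannot enter the beads and emerge
    first, together. Proteins below the limit enter the beads and are retarded,
    the smaller ones to a greater extent, so they emerge largest first.
    """
    excluded = sorted((m for m in masses_kda if m >= exclusion_limit_kda), reverse=True)
    included = sorted((m for m in masses_kda if m < exclusion_limit_kda), reverse=True)
    return [(m, "excluded") for m in excluded] + [(m, "included") for m in included]


if __name__ == "__main__":
    # Values from the worked example: 250, 125, and 75 kDa proteins, ~200 kDa limit.
    for mass, status in elution_order([125, 250, 75], 200):
        print(f"{mass} kDa ({status} from beads)")
```

In practice the boundary is not sharp and elution volume also depends on molecular shape, which is why the text restricts the comparison to proteins of similar basic shape.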

Chapter 18 Techniques in Cell and Molecular Biology

[Figure 18.26b label: mixture of protein sought and contaminant.]

Gel Filtration Chromatography

Gel filtration separates proteins (or nucleic acids) primarily on the basis of their effective size (hydrodynamic radius). Like ion-exchange chromatography, the separation material consists of tiny beads that are packed into a column through which the protein solution slowly passes. The materials used in gel filtration are composed of cross-linked polysaccharides (dextrans or agarose) of different porosity, which allow proteins to diffuse in and out of the beads. The technique is best illustrated by example (Figure 18.25). Suppose one is attempting to purify a globular protein having a molecular mass of 125,000 daltons. This protein is present in solution with two contaminating proteins of similar shape, one much larger at 250,000 daltons and the other much smaller at 75,000 daltons. One way the protein might be purified is to pass the mixture through a column of Sephadex G-150 beads, which allows entry to globular proteins that are less than about 200 kDa. When the protein mixture passes through the column bed, the 250-kDa protein is unable to enter the beads and remains dissolved in the moving solvent phase. As a result, the 250-kDa protein is eluted as soon as the preexisting solvent in the column (the bed volume) has dripped out. In contrast, the other two proteins can diffuse into the interstices within the beads and are retarded in their passage through the column. As more and more solvent moves through the column, these proteins move down its length and out the bottom, but they do so at different rates. Among those proteins that enter the beads, smaller species are retarded to a greater extent than larger ones. Consequently, the 125-kDa protein is eluted in a purified state, while the 75-kDa protein remains in the column.

Affinity Chromatography

Techniques described up to this point utilize the bulk properties of a protein to effect purification or fractionation. Another purification technique called affinity chromatography takes advantage of the unique structural properties of a protein, allowing one protein species to be specifically withdrawn from solution while all others remain behind in solution (Figure 18.26). Proteins interact with specific substances: enzymes with substrates, receptors with ligands, antigens with antibodies, and so forth. Each of these types of proteins can be removed from solution by passing a mixture of proteins through a column in which the specific interacting molecule (substrate, ligand, antibody, etc.) is covalently linked to an inert, immobilized material (the matrix). If, for example, an impure preparation of an acetylcholine receptor is passed through a column containing agarose beads to which an acetylcholine analogue is attached, the receptor binds specifically to the beads as long as the conditions in the column are suitable to promote the interaction (page 172). Once all the contaminating proteins have passed through the column and out the bottom, the acetylcholine receptor molecules can be displaced from their binding sites on the matrix by changing the ionic composition and/or pH of the solvent in the column. Thus, unlike the other chromatographic procedures that separate proteins on the basis of size or charge, affinity chromatography can achieve a near-total purification of the desired molecule in a single step.

Figure 18.26 Affinity chromatography. (a) Schematic representation of the coated agarose beads to which only a specific protein can combine. (b) Steps in the chromatographic procedure.

Determining Protein–Protein Interactions

One of the ways to learn about the function of a protein is to identify the proteins with which it interacts. Several techniques are available for determining which proteins in a cell might be capable of interacting with a given protein that has already been identified. One of these techniques was just discussed: affinity

[Figure 18.27 panels: (a) DNA-binding domain and activation domain of the transcription factor at the promoter of the lacZ gene: transcription. (b) DNA-binding domain fused to protein X: no transcription. (c) Activation domain fused to protein Y: no transcription. (d) DNA-binding domain fused to protein X together with activation domain fused to protein Y: transcription. (e) DNA-binding domain fused to protein X together with activation domain fused to protein Z: no transcription.]

Figure 18.27 Use of the yeast two-hybrid system. This test for protein–protein interaction depends on a cell being able to put together two parts of a transcription factor. (a) The two parts of the transcription factor (the DNA-binding domain and the activation domain) are seen here as the transcription factor binds to the promoter of a gene (lacZ) encoding β-galactosidase. (b) In this case, a yeast cell has synthesized the DNA-binding domain of the transcription factor linked to a known “bait” protein X. This complex cannot activate transcription. (c) In this case, a yeast cell has synthesized the activation domain of the transcription factor linked to an unknown “fish” protein Y. This complex cannot activate transcription. (d) In this case, a yeast cell has synthesized both X and Y protein constructs, which reconstitutes a complete transcription factor, allowing lacZ expression, which is readily detected. (e) If the second DNA had encoded a protein, for example, Z, that could not bind to X, no expression of the reporter gene would have been detected.


chromatography. Another technique uses antibodies. Consider, for example, that protein A, which has already been identified and purified, is part of a complex with two other proteins in the cytoplasm, proteins B and C. Once protein A has been purified, an antibody against this protein can be obtained and used as a probe to bind and remove protein A from solution. If a cell extract is prepared that contains the A–B–C protein complex, and the extract is incubated with the anti-A antibody, then binding of the antibody to the A protein will result in the coprecipitation of other proteins that are bound to A, in this case proteins B and C, which can then be identified. Coprecipitation of DNA fragments using the ChIP technique was illustrated in Figure 12.41.

The technique most widely used to search for protein–protein interactions is the yeast two-hybrid system, which was invented in 1989 by Stanley Fields and Ok-kyu Song at the State University of New York in Stony Brook. The technique is illustrated in Figure 18.27 and depends on the expression of a reporter gene, such as β-galactosidase (lacZ), whose activity is readily monitored by a test that detects a color change when the enzyme is present in a population of yeast cells. Expression of the lacZ gene in this system is activated by a particular protein, a transcription factor, that contains two domains, a DNA-binding domain and an activation domain (Figure 18.27a). The DNA-binding domain mediates binding to the promoter, and the activation domain mediates interaction with other proteins involved in the activation of gene expression. Both domains must be present for transcription to occur.

To employ the technique, two different types of recombinant DNA molecules are prepared. One DNA molecule contains a segment of DNA encoding the DNA-binding domain of the transcription factor linked to a segment of DNA encoding the “bait” protein (X). The bait protein is the protein that has been characterized and the one for which potential binding partners are being sought. When this recombinant DNA is expressed in a yeast cell, a hybrid protein such as that depicted in Figure 18.27b is produced in the cell. The other DNA molecule contains a segment of DNA encoding the activation domain of the transcription factor linked to DNA encoding an unknown protein (Y). Such DNAs (or cDNAs as they are called) are prepared from mRNAs by reverse transcriptase as described on page 774. Let’s assume that Y is a protein capable of binding to the bait protein. When a recombinant DNA encoding Y is expressed in a yeast cell, a hybrid protein such as that depicted in Figure 18.27c is produced in the cell. When produced in a cell alone, neither the X- nor Y-containing hybrid protein is capable of activating transcription of the lacZ gene (Figure 18.27b,c). However, if both of these particular recombinant DNA molecules are introduced into the same yeast cell (as in Figure 18.27d), the X and Y proteins can interact with one another to reconstitute a functional transcription factor, an event that can be detected by the cell’s ability to produce β-galactosidase. Using this technique, researchers are able to “fish” for proteins encoded by unknown genes that are capable of interacting with the “bait” protein. The yeast two-hybrid (Y2H) methodology is particularly well-suited for screening large numbers of proteins and its use in large-scale (high-throughput) proteomic studies is


discussed on page 62. In recent years, the two-hybrid approach pioneered in yeast has been adapted for use in mammalian cells. To date, studies using mammalian two-hybrid systems tend to be focused on the interactions of specific proteins rather than screening of large-scale protein libraries.
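The logic of the two-hybrid test summarized in Figure 18.27 reduces to a simple condition: the reporter is expressed only when both hybrid proteins are present in the same cell and the bait and the candidate protein bind one another. The Python sketch below encodes that condition and applies it to a toy screen. The function names, the protein names, and the table of interacting pairs are hypothetical and serve only to illustrate the reasoning; a real screen is subject to false positives and false negatives that this model ignores.

```python
def reporter_expressed(has_dbd_fusion, has_ad_fusion, bait_binds_candidate):
    """Return True if lacZ would be expressed in this idealized model.

    has_dbd_fusion: cell makes the DNA-binding domain fused to the bait (X).
    has_ad_fusion: cell makes the activation domain fused to the candidate.
    bait_binds_candidate: the two fused proteins interact with each other.
    """
    return has_dbd_fusion and has_ad_fusion and bait_binds_candidate


def screen_library(bait, candidates, interacting_pairs):
    """List the candidates that would turn on the reporter with this bait.

    interacting_pairs is a set of frozensets, each holding two protein names
    that are assumed to bind one another.
    """
    hits = []
    for candidate in candidates:
        binds = frozenset((bait, candidate)) in interacting_pairs
        if reporter_expressed(True, True, binds):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Hypothetical interaction table: X binds Y but not Z, as in Figure 18.27d,e.
    known_pairs = {frozenset(("X", "Y"))}
    print("Panel b (bait fusion only):", reporter_expressed(True, False, False))
    print("Panel c (candidate fusion only):", reporter_expressed(False, True, False))
    print("Hits for bait X:", screen_library("X", ["Y", "Z"], known_pairs))
```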

[Figure 18.28, step 1 labels: sample with tracking dye being loaded into well; polyacrylamide gel slab positioned between two glass plates.]

Polyacrylamide Gel Electrophoresis

Another powerful technique that is widely used to fractionate proteins is electrophoresis. Electrophoresis depends on the ability of charged molecules to migrate when placed in an electric field. The electrophoretic separation of proteins is usually accomplished using polyacrylamide gel electrophoresis (PAGE), in which the proteins are driven by an applied current through a gelated matrix. The matrix is composed of polymers of a small organic molecule (acrylamide) that is cross-linked to form a molecular sieve. A polyacrylamide gel may be formed as a thin slab between two glass plates or as a cylinder within a glass tube. Once the gel has polymerized, the slab (or tube) is suspended between two compartments containing buffer in which opposing electrodes are immersed. In a slab gel, the concentrated, protein-containing sample is layered in slots along the top of the gel, as shown in step 1 of Figure 18.28. The protein sample is prepared in a solution containing sucrose or glycerol, whose density prevents the sample from mixing with the buffer in the upper compartment. A voltage is then applied between the buffer compartments, and current flows across the slab, causing the proteins to move toward the oppositely charged electrode (step 2). Separations are typically carried out using alkaline buffers, which make the proteins negatively charged and cause them to migrate toward the positively charged anode at the opposite end of the gel. Following electrophoresis, the slab is removed from the glass plates and stained (step 3). The relative movement of proteins through a polyacrylamide gel depends on the charge density (charge per unit of mass) of the molecules. The greater the charge density, the more forcefully the protein is driven through the gel, and thus the more rapid its rate of migration. But charge density is only one important factor in PAGE fractionation; size and shape also play a role.
Polyacrylamide forms a cross-linked molecular sieve that entangles proteins passing through the gel. The larger the protein, the more it becomes entangled, and the slower it migrates. Shape is also a factor because compact globular proteins move more rapidly than elongated fibrous proteins of comparable molecular mass. The concentration of acrylamide (and cross-linking agent) used in making the gel is also an important factor. The lower the concentration of acrylamide, the less the gel becomes cross-linked, and the more rapidly a given protein molecule migrates. A gel containing 5 percent acrylamide might be useful for separating proteins of 60 to 250 kDa, whereas a gel of 15 percent acrylamide might be useful for separating proteins of 10 to 50 kDa. The progress of electrophoresis is followed by watching the migration of a charged tracking dye that moves just ahead of the fastest proteins (step 2, Figure 18.28). After the tracking dye has moved to the desired location, the current is turned off, and the gel is removed from its container. Gels are

[Figure 18.28 labels: step 1; upper reservoir with buffer; negative electrode (cathode); step 2; tracking dye shows extent of electrophoretic movement; positive electrode (anode); lower reservoir with buffer; power source; bands marked larger and smaller; step 3; gel in staining solution in tray.]

Figure 18.28 Polyacrylamide gel electrophoresis. The protein samples are typically dissolved in a sucrose solution whose density prevents the sample from mixing with the buffer and then loaded into the wells with a fine pipette as shown in step 1. In step 2, a direct current is applied across the gel, which causes the proteins to move into the polyacrylamide along parallel lanes. When carried out in the detergent SDS, which is usually the case, the proteins move as bands at rates that are inversely proportional to their molecular mass. Once electrophoresis is completed, the gel is removed from the glass frame and stained in a tray (step 3).

typically stained with Coomassie Blue or silver stain to reveal the locations of the proteins. If the proteins are radioactively labeled, their locations can be determined by pressing the gel


against a piece of X-ray film to produce an autoradiograph, or the gel can be sliced into fractions and individual proteins isolated. Alternatively, the proteins in the gel can be transferred by a second electrophoretic procedure to a nitrocellulose membrane to form a blot (page 763). Proteins become adsorbed onto the surface of the membrane in the same relative positions they occupied in the gel. In a Western blot, the proteins on the membrane are identified by their interaction with specific antibodies.

SDS–PAGE

Polyacrylamide gel electrophoresis (PAGE) is usually carried out in the presence of the negatively charged detergent sodium dodecyl sulfate (SDS), which binds in large numbers to all types of protein molecules (page 133). The electrostatic repulsion between the bound SDS molecules causes the proteins to unfold into a similar rod-like shape, thus eliminating differences in shape as a factor in separation. The number of SDS molecules that bind to a protein is roughly proportional to the protein’s molecular mass (about 1.4 g SDS/g protein). Consequently, each protein species, regardless of its size, has an equivalent charge density and is driven through the gel with the same force. However, because the polyacrylamide is highly cross-linked, larger proteins are held up to a greater degree than smaller proteins. As a result, proteins become separated by SDS–PAGE on the basis of a single property: their molecular mass. In addition to separating the proteins in a mixture, SDS–PAGE can be used to determine the molecular mass of the various proteins by comparing the positions of the bands to those produced by

[Figure 18.29 axes: pH (horizontal); molecular weight × 10⁻³ (vertical).]

Figure 18.29 Two-dimensional gel electrophoresis. A two-dimensional polyacrylamide gel of HeLa cell nonhistone chromosomal proteins labeled with [35S]methionine. Over a thousand different proteins can be resolved by this technique. (FROM J. L. PETERSON AND E. H. MCCONKEY, J. BIOL. CHEM. 251:550, 1976. © 1976 THE AMERICAN SOCIETY FOR BIOCHEMISTRY AND MOLECULAR BIOLOGY.)

Two-Dimensional Gel Electrophoresis

In 1975, a technique called two-dimensional gel electrophoresis was developed by Patrick O’Farrell at the University of California, San Francisco, to fractionate complex mixtures of proteins using two different properties of the molecules. Proteins are first separated in a tubular gel according to their isoelectric point (page 753) by a technique called isoelectric focusing. After separation, the gel is removed and placed on top of a slab of SDS-saturated polyacrylamide and subjected to SDS–PAGE. The proteins move into the slab gel and become separated according to their molecular mass (Figure 18.29). Once separated, individual proteins can be removed from the gel and digested into peptide fragments that can be analyzed by mass spectrometry. The resolution of the technique is sufficiently high to distinguish most of the proteins in a cell. Because of its great resolving power, two-dimensional gel electrophoresis is ideally suited to detect changes in the proteins present in a cell under different conditions, at different stages in development or the cell cycle, or in different organisms (see Figure 2.48). The technique, however, is not suitable for distinguishing among proteins that have high molecular mass, that are highly hydrophobic, or that are present at very low copy numbers per cell.

Protein Measurement and Analysis

One of the simplest and most widely used methods to determine the amount of protein (or nucleic acid) present in a given solution is to measure the amount of light of a specific wavelength that is absorbed by that solution. The instrument used for this measurement is a spectrophotometer. To make this type of measurement, the solution is placed in a special, flat-sided, quartz container (quartz is used because, unlike glass, it does not absorb ultraviolet light), termed a cuvette, which is then placed in the light beam of the spectrophotometer. The amount of light that passes through the solution unabsorbed (i.e., the transmitted light) is measured by photocells on the other side of the cuvette. Of the 20 amino acids incorporated into proteins, two of them, tryptophan and tyrosine, absorb light in the ultraviolet range with an absorbance maximum at about 280 nm. If the proteins being studied have a typical percentage of these amino acids, then the absorbance of the solution at this wavelength provides a measure of protein concentration. Alternatively, one can use a variety of chemical assays, such as the Lowry or Biuret technique, in which the protein in solution is engaged in a reaction that produces a colored product whose concentration is proportional to the concentration of protein.

Mass Spectrometry

As discussed on page 71, the emerging field of proteomics depends heavily on the analysis of proteins by mass spectrometry. Mass spectrometers are analytical instruments used primarily to measure the masses of molecules, determine chemical formulas and molecular structure, and identify unknown substances. Mass spectrometers


proteins of known size. Examples of SDS–PAGE are shown on pages 146 and 173.
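The comparison with proteins of known size is usually made with a standard curve. The Python sketch below assumes the common empirical approximation that, over the useful range of a given gel, the logarithm of molecular mass falls roughly linearly with relative mobility (distance migrated by the band divided by distance migrated by the tracking dye). That relationship is not stated in the text, and the marker masses and mobilities used here are hypothetical values chosen for illustration, as are the function names.

```python
import math


def fit_standard_curve(standards):
    """Least-squares fit of log10(mass) = intercept + slope * relative_mobility.

    standards: list of (mass_kda, relative_mobility) pairs for marker proteins.
    Returns (intercept, slope).
    """
    mobilities = [mobility for _, mobility in standards]
    log_masses = [math.log10(mass) for mass, _ in standards]
    count = len(standards)
    mean_x = sum(mobilities) / count
    mean_y = sum(log_masses) / count
    sxx = sum((x - mean_x) ** 2 for x in mobilities)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mobilities, log_masses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_mass_kda(relative_mobility, intercept, slope):
    """Read an unknown band's mass off the fitted standard curve."""
    return 10 ** (intercept + slope * relative_mobility)


if __name__ == "__main__":
    # Hypothetical markers: (mass in kDa, relative mobility on this gel).
    markers = [(100.0, 0.2), (50.0, 0.4), (25.0, 0.6), (12.5, 0.8)]
    intercept, slope = fit_standard_curve(markers)
    unknown_mobility = 0.5  # hypothetical band, midway between the 50 and 25 kDa markers
    print(f"Estimated mass: {estimate_mass_kda(unknown_mobility, intercept, slope):.1f} kDa")
```

An estimate of this kind is only as good as the markers that bracket the unknown band; bands that migrate outside the range of the standards fall where the linear approximation is least reliable.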


accomplish these feats by converting the substances in a sample into positively charged, gaseous ions, which are accelerated through a curved tube toward a negatively charged plate (Figure 18.30). As the ions pass through the tube, they are subjected to a magnetic field that causes them to separate from one another according to their molecular mass [or more precisely according to their mass-to-charge (m/z) ratio]. The ions strike an electronic detector located at the end of the tube. Smaller ions travel faster and strike the detector more rapidly than larger ions. The input to the detector is converted into a series of peaks of ascending m/z ratio (as in Figure 2.49). Mass spectrometers have been a favorite instrument of chemists for many years, but it has only been in the past two decades that biologists have discovered their remarkable analytic powers. Now, using mass spectrometry (MS), protein biochemists are able to rapidly identify the proteins present in a particular type of cell, organelle, or protein complex. To carry out this analysis, the protein sample is generally digested with trypsin, fractionated by liquid chromatography, and the peptides introduced into a mass spectrometer where they are gently ionized and made gaseous by one of two procedures. Development of these peptide ionization techniques has been the key in adapting MS to the study of proteins. In one procedure, called matrix-assisted laser desorption ionization (MALDI), the protein sample is applied as part of a crystalline matrix, which is irradiated by a laser pulse. The energy of the laser excites the matrix and the absorbed energy converts the peptides into gaseous ions. In an alternate procedure, called electrospray ionization (ESI), an electric potential is applied to a peptide solution, causing the peptides to ionize and the liquid to spray as a fine mist of charged particles that enter the spectrometer. 
Because it acts on molecules in solution, ESI is well suited to ionize peptides prepared from proteins fractionated by a widely employed liquid chromatography technique. Once the molecular masses of the peptides in the sample have been determined, the proteins can be identified by a database search as discussed on page 72. If each of the proteins cannot be identified unambiguously, one or more of the peptides generated by tryptic digestion can be fragmented⁴ in a second step and subjected to another round of mass spectrometry. This two-step procedure (called tandem MS, or MS/MS) yields the amino acid sequence of the peptides, which can then be assembled into the set of proteins from which they are derived. MS/MS is so powerful that complex mixtures of hundreds of unknown proteins can be digested and subjected to mass spectrometry and the identity of each of the proteins in the mixture determined. MS/MS can also be used to identify the specific posttranslational modifications that are present on a particular protein under a particular set of physiological conditions.

⁴ Fragmentation is accomplished within the mass spectrometer by collision of the peptides with an inert gas. The energy of collision breaks peptide bonds to produce a random collection of fragments of the original peptide. The amino acid sequence of each fragment, and hence that of the original peptide, can be determined by searching a database containing the masses of theoretical fragments having every possible sequence of amino acids that can be formed from the proteins encoded by that genome.
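The first step of the identification described above, matching measured peptide masses against a database, can be sketched in Python. The example digests a protein sequence in silico, computes the monoisotopic mass and m/z of each peptide, and counts how many observed masses a candidate protein can explain. Several details are assumptions not given in the text: the cleavage rule (trypsin is taken to cut after lysine or arginine unless the next residue is proline), the standard monoisotopic residue masses, the mass tolerance, the function names, and the two database sequences, which are invented for illustration. Fragment-level (MS/MS) matching is not modeled.

```python
# Monoisotopic residue masses in daltons (standard values; not taken from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.00728   # mass of each charge-carrying proton


def tryptic_digest(sequence):
    """Split a sequence after each K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        next_residue = sequence[index + 1] if index + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of the neutral peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def mass_to_charge(peptide, charge):
    """m/z of the peptide carrying the given number of protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


def count_matches(observed_masses, sequence, tolerance=0.01):
    """Count observed masses that match a tryptic peptide of this sequence."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(observed - mass) <= tolerance for mass in theoretical)
        for observed in observed_masses
    )


def best_match(observed_masses, database, tolerance=0.01):
    """Return the database entry name that explains the most observed masses."""
    return max(database, key=lambda name: count_matches(observed_masses, database[name], tolerance))


if __name__ == "__main__":
    # Hypothetical two-entry database with invented sequences.
    database = {
        "hypothetical_protein_1": "MAGKSTVRPLLEKWYR",
        "hypothetical_protein_2": "GGSKAAPRDDEEFK",
    }
    observed = [peptide_mass(p) for p in tryptic_digest(database["hypothetical_protein_1"])]
    print("Peptides:", tryptic_digest(database["hypothetical_protein_1"]))
    print("Best match:", best_match(observed, database))
```

Because an ion of charge z appears at (M + z × proton mass)/z, the same peptide gives peaks at different m/z values for different charge states, which is why the text specifies separation by mass-to-charge ratio rather than by mass alone.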

[Figure 18.30 labels: positive ions formed in electrical discharge; beam of positive ions; magnet whose strength can be varied; beam is divided into several beams, each containing ions of the same mass; detector.]

Figure 18.30 Principles of operation of a mass spectrometer. (FROM J. E. BRADY, J. RUSSELL, AND J. R. HOLUM, CHEMISTRY, 3D ED.; COPYRIGHT © 2000, JOHN WILEY AND SONS, INC. REPRINTED WITH PERMISSION OF JOHN WILEY AND SONS, INC.)

18.8 | Determining the Structure of Proteins and Multisubunit Complexes

X-ray crystallography (or X-ray diffraction) utilizes protein crystals, which are bombarded with a fine beam of X-rays (Figure 18.31). The radiation that is scattered (diffracted) by the electrons of the protein’s atoms strikes an electron-sensitive detector placed behind the crystal. The diffraction pattern produced by a crystal is determined by the structure within the protein; the large number of molecules in the crystal reinforces the reflections, causing it to behave as if it were one giant molecule. The positions and intensities of the reflections, such as those on the photographic plate of Figure 2.33, can be related mathematically to the electron densities within the protein, because it is the electrons of the atoms that produced them. The resolution obtained by X-ray diffraction depends on the number of spots that are analyzed.

[Figure 18.31 labels: X-ray source; incident X-ray beam; crystal; diffracted X-rays; photographic film with reflections numbered −3 to 3.]

Figure 18.31 X-ray diffraction analysis. Schematic diagram of the diffraction of X-rays by atoms of one plane of a crystal onto a photographic plate. The ordered array of the atoms within the crystal produces a repeating series of overlapping circular waves that spread out and intersect the film. As with diffraction of visible light, the waves form an interference pattern, reinforcing each other at some points on the film and canceling one another at other points.


for membrane proteins, where it is difficult to obtain the three-dimensional crystals required for analysis. Structural analysis of these types of specimens is often conducted using an alternate technique that takes advantage of the tremendous resolving power of the electron microscope and computer-based image processing techniques. There are two general approaches to the study of single particles with the electron microscope. In one approach, the particles are placed on an electron microscope grid, negatively stained, and then air-dried (as discussed on page 744). In the alternate approach, which is referred to as electron cryomicroscopy, or cryo-EM, the particles are placed on a grid and rapidly frozen in a hydrated state in liquid nitrogen without being fixed or stained. For higher resolution studies, the use of negatively stained specimens has largely been replaced by frozen-hydrated particles, which are much less likely to generate artifacts and are much better suited for the study of internal features within the particle’s structure. In either case, the grids are placed in the column of the microscope and photographs of the particles are taken. Each photograph is a two-dimensional image of an individual particle in the orientation that it happened to assume as it rested on the grid. When two-dimensional images of tens of thousands of different specimens in every conceivable orientation are averaged by high-powered computer analysis, a three-dimensional reconstruction of the particle at a resolution of ~10 Å can be generated. At this resolution, investigators can trace the polypeptide chain of a protein and even identify the location of bulky amino acid side chains. A model of a eukaryotic ribosome based on cryo-EM is illustrated in Figure 2.58, a model of a clathrin-coated vesicle in Figure 8.40c, and a model of a U1 snRNP particle in Figure 11.33. This technique, which is referred to as single-particle reconstruction, is also useful for capturing images of a structure, such as a ribosome, at different stages during a dynamic process, such as the elongation step of protein synthesis. Using this approach, several of the major conformational changes that occur during each step of translation have been revealed. In addition, atomic-resolution structures determined by X-ray crystallography can be fitted into the lower resolution

Figure 18.32 Electron density distribution of a small organic molecule (diketopiperazine) calculated at several levels of resolution. At the lowest resolution (a), only the ring nature of the molecule can be distinguished, whereas at the highest resolution (d), the electron density around each atom (indicated by the circular contour lines) is revealed.

[Figure 18.32 panels: (a) 6-Å resolution; (b) 2-Å resolution; (c) 1.5-Å resolution; (d) 1.1-Å resolution.]

(AFTER D. HODGKIN. REPRINTED WITH PERMISSION FROM NATURE 188:445, 1960; COPYRIGHT 1960, NATURE BY NATURE PUBLISHING GROUP. REPRODUCED WITH PERMISSION OF NATURE PUBLISHING GROUP IN THE FORMAT REUSE IN A BOOK/TEXTBOOK VIA COPYRIGHT CLEARANCE CENTER.)


Myoglobin was the first protein whose structure was determined by X-ray diffraction. The protein was analyzed successively at 6, 2, and 1.4 Å, with years elapsing between each completed determination. Considering that covalent bonds are between 1 and 1.5 Å in length and noncovalent bonds between 2.8 and 4 Å in length, the information gathered for a protein depends on the resolution achieved. This is illustrated by a comparison of the electron density of a small organic molecule at four levels of resolution (Figure 18.32). In myoglobin, a resolution of 6 Å was sufficient to show the manner in which the polypeptide chain is folded and the location of the heme moiety, but it was not sufficient to show structure within the chain. At a resolution of 2 Å, groups of atoms could be separated from one another, whereas at 1.4 Å, the positions of individual atoms were determined. To date, the structures of several hundred proteins have been determined at atomic resolution (<1.2 Å) and a few as low as 0.66 Å. Over the years, X-ray diffraction technology has improved greatly. It took Max Perutz 22 years to solve the structure of hemoglobin (Figure 2.38b), a task that today might take a few weeks. In most current studies: (1) intense, highly focused X-ray beams are generated by synchrotrons (see p. 100), which are high-energy particle accelerators that produce X-rays as a by-product, and (2) highly sensitive electronic detectors (charge-coupled devices, or CCDs) that provide a digital readout of the diffraction data have replaced photographic plates. Use of these instruments in conjunction with increasingly powerful computers allows researchers to collect and analyze sufficient data to determine the tertiary structure of most proteins in a matter of hours. As a result of these advances, X-ray crystallography has been applied to the analysis of larger and larger molecular structures.
This is probably best illustrated by the success attained in the determination of the structure of the ribosome, which is discussed in Chapter 11. In most cases, as was true in the study of the ribosome, the greatest challenge in this field is obtaining usable crystals. X-ray crystallography is ideally suited for determining the structure of soluble proteins that lend themselves to crystallization, but can be very challenging for the study of complex multisubunit structures, such as ribosomes or proteasomes, or


relative to the electron beam. This technique is known as electron crystallography.

18.9 | Fractionation of Nucleic Acids

Any systematic fractionation method must exploit differences between members of a mixture to effect separation. Nucleic acid molecules can differ from one another in overall size, base composition, topology, and nucleotide sequence. Accordingly, fractionation methods for nucleic acids are based on these features.

Separation of DNAs by Gel Electrophoresis

Chapter 18 Techniques in Cell and Molecular Biology

Figure 18.33 Combining data from electron microscopy and X-ray crystallography provides information on protein–protein interactions and the structure of multisubunit complexes. The electron microscopic reconstruction of an actin-ADF filament is shown in grey. The high-resolution X-ray crystal structures of individual actin monomers (red) and ADF molecules (green) have been fitted into the lower resolution EM structure. (COURTESY OF EDWARD H. EGELMAN, UNIVERSITY OF VIRGINIA, J. CELL BIOL. 163:1059, 2003 FIG. 2A. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

electron microscopic reconstructions to show how the individual molecules that make up a multisubunit complex interact and how they might work together to carry out a particular activity. Figure 18.33 shows the tertiary structure of a filament composed of actin and ADF (a member of the cofilin family, page 373). The structures of the two proteins were determined by separate X-ray crystallographic studies and then fitted into an electron microscopic model of an actin-ADF filament. The reconstruction shown in this figure served as the basis for a proposed mechanism by which cofilin family proteins can induce severing and depolymerization of an actin filament (page 378). Electron microscopic analysis of frozen specimens can also be utilized in the study of membrane proteins, such as the nicotinic acetylcholine receptor (see the Experimental Pathway for Chapter 4). This type of analysis requires that the membrane proteins be closely packed at very low temperatures (e.g., −195°C) into two-dimensional crystalline arrays within the plane of the membrane. This technique offers an advantage over X-ray crystallographic studies of membrane proteins in that the protein remains during the entire process within its own natural membrane rather than being extracted in detergent and crystallized in a nonmembranous environment. The structures illustrated on page 174 were determined from combined, high-resolution, electron-microscopic images of many different protein molecules taken at various angles

Of the various techniques used in the fractionation of proteins discussed earlier, one of them, gel electrophoresis, is also widely used to separate nucleic acids of different molecular mass (i.e., nucleotide length). Small RNA or DNA molecules of a few hundred nucleotides or less are generally separated by polyacrylamide gel electrophoresis. Larger molecules have trouble making their way through the cross-linked polyacrylamide and are generally fractionated on agarose gels, which have greater porosity. Agarose is a polysaccharide extracted from seaweed; it is dissolved in hot buffer, poured into a mold, and caused to gelate simply by lowering the temperature. Separation of DNA molecules greater than about 25 kb is generally accomplished using the technique of pulsed-field electrophoresis in which the direction of the electric field in the gel is periodically changed, which causes the DNA molecules to reorient themselves during their migration. Following electrophoresis, the DNA fragments in the gel are visualized by soaking the gel in a solution of stain such as ethidium bromide. Ethidium bromide intercalates into the double helix and causes the DNA bands to appear fluorescent when viewed under ultraviolet light (Figure 18.34). The sensitivity of gel electrophoresis is so great that DNA or RNA molecules that differ by only a single nucleotide can be separated from one another, a feature that gave rise to an invaluable method for DNA sequencing (page 771). Since the rate of migration through a gel can also be affected by the shape of the molecule, electrophoresis can be used to separate molecules with different conformations, such as circular and linear or relaxed and supercoiled forms (see Figure 10.12).
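The separation by length described above is often turned into a size estimate by running fragments of known length alongside the sample. The following is a minimal sketch of that calculation, assuming that over a limited size range the migration distance varies roughly linearly with the logarithm of fragment length. This relationship is a common working approximation and is not stated in the text. The marker sizes and distances are hypothetical values chosen only for illustration.

```python
import math

def fit_standard_curve(marker_sizes_bp, marker_distances_mm):
    """Least-squares fit of log10(size) = slope * distance + intercept."""
    logs = [math.log10(size) for size in marker_sizes_bp]
    n = len(logs)
    mean_d = sum(marker_distances_mm) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in marker_distances_mm)
    sxy = sum((d - mean_d) * (l - mean_l)
              for d, l in zip(marker_distances_mm, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return slope, intercept

def estimate_size_bp(distance_mm, slope, intercept):
    """Estimate fragment length from its migration distance."""
    return 10 ** (slope * distance_mm + intercept)

# Hypothetical marker ladder: sizes in base pairs, distances in millimeters.
MARKER_SIZES = [10000, 1000, 100]
MARKER_DISTANCES = [10.0, 30.0, 50.0]

slope, intercept = fit_standard_curve(MARKER_SIZES, MARKER_DISTANCES)
unknown_size = estimate_size_bp(20.0, slope, intercept)
print(f"Estimated size of band at 20 mm: {unknown_size:.0f} bp")
```

The negative slope reflects the behavior noted in Figure 18.34: shorter fragments migrate faster and therefore travel farther in a given time.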

Separation of Nucleic Acids by Ultracentrifugation

Common experience tells us that the stability of a solution (or suspension) depends on the components. Cream floats to the top of raw milk, a fine precipitate gradually settles to the bottom of a container, and a solution of sodium chloride remains stable indefinitely. Numerous factors determine whether or not a given component will settle through a liquid medium, including the size, shape, and density of the substance and the density and viscosity of the medium. If a component in a


[Figure 18.34 labels: cell; DNA; restriction enzyme digestion; DNA fragments; slot of mixture of restriction fragments; gel electrophoresis apparatus with agarose gel slab, buffer solution, negative electrode, positive electrode, electrical cable, and power supply; direction of movement of DNA fragments during electrophoresis; DNA fragments separated by size after electrophoresis; gel is removed from electrophoresis apparatus after DNA fragments have been separated by size (shorter fragments migrate faster). Panels (a) and (b).]

solution or suspension is denser than its medium, then centrifugal force causes it to become concentrated toward the bottom of a centrifuge tube. Larger particles sediment more rapidly

Velocity Sedimentation

The rate at which a given molecule moves in response to centrifugal force in a centrifuge is known as its sedimentation velocity. Since the sedimentation velocity changes as the centrifugal force changes, a given molecule is characterized by a sedimentation coefficient, which is its sedimentation velocity divided by the force. Throughout this book we have referred to various macromolecules and their complexes as having a particular S value. The unit S (or Svedberg, after the inventor of the ultracentrifuge) is equivalent to a sedimentation coefficient of 10⁻¹³ sec. Because the velocity at which a particle moves through a liquid column depends on a number of factors, including shape, the determination of sedimentation coefficients does not by itself provide the molecular mass. However, as long as one is dealing with the same type of molecule, the S value provides a good measure of relative size. For example, the three ribosomal RNAs of E. coli, the 5S, 16S, and 23S molecules, have lengths of 120, 1600, and 3200 nucleotides, respectively. In velocity (or rate-zonal) sedimentation, nucleic acid molecules are separated according to nucleotide length (Figure 18.35a). The sample containing the mixture of nucleic acid molecules is carefully layered over a solution containing an increasing concentration of sucrose (or other suitable substance). This preformed gradient increases in density (and viscosity) from the top to the bottom. When subjected to high centrifugal forces, the molecules move through the gradient at a rate determined by their sedimentation coefficient. The greater the sedimentation coefficient, the farther a molecule moves in a given period of centrifugation. Because the density of the medium is less than that of the nucleic acid molecules, even at the bottom of the tube (approximately 1.2 g/ml for the sucrose solution and 1.7 g/ml for the nucleic acid), these molecules continue to sediment as long as the tube is centrifuged.
In other words, centrifugation never reaches equilibrium. After a prescribed period, the tube is removed from the centrifuge, its contents are fractionated (as shown in Figure 18.35c), and the relative positions of the various molecules are determined. The presence of the viscous sucrose prevents the contents of the tube from becoming mixed due to either convection or handling, allowing molecules of identical S value to remain in place in the form of a band. If marker molecules of


Figure 18.34 Separation of DNA restriction fragments by gel electrophoresis. (a) DNA is incubated with a restriction enzyme, which cuts it into fragments (page 764). The mixture of fragments is introduced into a slot, or well, in a slab of agarose and an electric current is applied. The negatively charged DNA molecules migrate toward the positive electrode and separate by size. (b) All of the DNA fragments that are present in a gel can be revealed by immersing the gel in a solution of ethidium bromide and then viewing the gel under an ultraviolet light. (B: PHILLIPE PLAILLY/SCIENCE PHOTO LIBRARY/ PHOTO RESEARCHERS, INC.)

than smaller particles of similar shape and density. The tendency for molecules to become concentrated during centrifugation is counteracted by the effects of diffusion, which causes the molecules to become redistributed in a more uniform (random) arrangement. With the development of ultracentrifuges, it has become possible to generate centrifugal forces greater than 500,000 times the force of gravity, which is sufficient to overcome the effects of diffusion and cause macromolecules to sediment toward the bottom of a centrifuge tube. Centrifugation proceeds in a near vacuum to minimize frictional resistance. DNA (and RNA) molecules have been extensively analyzed by techniques utilizing the ultracentrifuge. For the present purpose, we will consider two centrifugation techniques used in studies of nucleic acids that have been discussed in this text. These techniques are illustrated in Figure 18.35.

Figure 18.35 Techniques of nucleic acid sedimentation. (a) Separation of different-sized DNA molecules by velocity sedimentation. The sucrose density gradient is formed within the tube (step 1) by allowing a sucrose solution of increasing concentration to drain along the wall of the tube. Once the gradient is formed, the sample is carefully layered over the top of the gradient (steps 2 and 3), and the tube is subjected to centrifugation (e.g., 50,000 rpm for 5 hours) as illustrated in step 4. The DNA molecules are separated on the basis of their size (step 5). (b) Separation of DNA molecules by equilibrium sedimentation on the basis of differences in base composition. The DNA sample is mixed with the CsCl solution (step 1) and subjected to extended centrifugation (e.g., 50,000 rpm for 72 hours). The CsCl gradient forms during the centrifugation (step 2), and the DNA molecules band in regions of equivalent density (step 3). (c) The tube from the experiment of b is punctured and the contents are allowed to drip into successive tubes, thereby fractionating the tube’s contents. The absorbance of the solution in each fraction is measured and plotted as shown.

[Figure 18.35a labels: sucrose solution; sample; 5% sucrose; 20% sucrose; steps 1 to 5; small-size, medium-size, and large-size DNA molecules.]

known sedimentation coefficient are present, the S values of unknown components can be determined. Experimental results obtained by sucrose-density gradient centrifugation are shown in Figures 11.13 and 11.17.
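The definition of the sedimentation coefficient given above can be illustrated with a short calculation. The sketch below assumes the usual convention that the "force" is expressed as the centrifugal field per unit mass, ω²r, so that s = v / (ω²r). The rotor speed of 50,000 rpm comes from the legend of Figure 18.35, whereas the rotor radius and the function names are hypothetical.

```python
import math

SVEDBERG = 1e-13      # seconds; one S unit as defined in the text
STANDARD_G = 9.80665  # m/s^2, standard acceleration of gravity

def angular_velocity(rpm):
    """Convert revolutions per minute to radians per second."""
    return rpm * 2.0 * math.pi / 60.0

def relative_centrifugal_force(rpm, radius_m):
    """Centrifugal field expressed as a multiple of gravity."""
    omega = angular_velocity(rpm)
    return omega * omega * radius_m / STANDARD_G

def sedimentation_coefficient_S(velocity_m_per_s, rpm, radius_m):
    """Sedimentation coefficient in Svedberg units: v / (omega^2 * r)."""
    omega = angular_velocity(rpm)
    return velocity_m_per_s / (omega * omega * radius_m) / SVEDBERG

# Hypothetical rotor radius of 8 cm at the speed quoted in Figure 18.35.
rcf = relative_centrifugal_force(50000, 0.08)
print(f"Field at 50,000 rpm and r = 8 cm: about {rcf:,.0f} x g")
```

Because the field depends on the square of the rotor speed, doubling the speed quadruples the sedimentation velocity of a given molecule while leaving its S value unchanged.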

18.10 | Nucleic Acid Hybridization

Nucleic acid hybridization includes a variety of related techniques that are based on the observation that two single-stranded nucleic acid molecules of complementary base sequence can form a double-stranded hybrid. Consider a situation

[Figure 18.35b labels: steps 1 to 3; CsCl density from 1.65 gram/ml to 1.75 gram/ml; AT-rich DNA molecules; GC-rich DNA molecules.]

Equilibrium Centrifugation

In the other type of centrifugation technique, equilibrium (or isopycnic) centrifugation (Figure 18.35b), nucleic acid molecules are separated on the basis of their buoyant density. In this procedure, one generally employs a highly concentrated solution of the salt of the heavy metal cesium. The analysis is begun by mixing the DNA with the solution of cesium chloride or cesium sulfate in the centrifuge tube and then subjecting the tube to extended centrifugation (e.g., 2 to 3 days at high forces). During the centrifugation, the heavy cesium ions are slowly driven toward the bottom of the tube, forming a continuous density gradient through the liquid column. After a time, the tendency for cesium ions to be concentrated toward the bottom of the tube is counterbalanced by the opposing tendency for them to become redistributed by diffusion, and the gradient becomes stabilized. As the cesium gradient is forming, individual DNA molecules are driven downward or move buoyantly upward in the tube until they reach a position that has a buoyant density equivalent to their own, at which point they are no longer subject to further movement. Molecules of equivalent density form narrow bands within the tube (see Figure 18.41). This technique is sensitive enough to separate DNA molecules having different base composition (as illustrated in Figure 18.35b) or ones containing different isotopes of nitrogen (¹⁵N vs. ¹⁴N, as shown in Figure 13.3b).
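The dependence of buoyant density on base composition can be made concrete with a small calculation. The sketch below uses a widely quoted empirical relation for double-stranded DNA in CsCl, ρ ≈ 1.660 + 0.098 × (GC fraction) g/ml. This relation and its coefficients are not given in the text and should be treated as an assumption, and the example sequences are hypothetical.

```python
def gc_fraction(sequence):
    """Fraction of bases in the sequence that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)

def buoyant_density_cscl(gc):
    """Approximate buoyant density (g/ml) of duplex DNA in CsCl.

    Assumed empirical relation: rho = 1.660 + 0.098 * GC fraction.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("GC fraction must lie between 0 and 1")
    return 1.660 + 0.098 * gc

# Hypothetical AT-rich and GC-rich sequences for comparison.
at_rich = "ATATTAATGATTAATA"
gc_rich = "GCGCCGGCATGCCGGC"
for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    rho = buoyant_density_cscl(gc_fraction(seq))
    print(f"{name}: GC = {gc_fraction(seq):.2f}, density = {rho:.3f} g/ml")
```

Under this assumed relation, GC-rich DNA bands at a higher density (lower in the tube) than AT-rich DNA, consistent with the arrangement labeled in Figure 18.35b.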

[Figure 18.35c axes: absorbance (O.D.) at 260 nm plotted against fraction number (1 to 13, from the bottom to the top of the tube).]

in which one has a mixture of hundreds of fragments of DNA of identical length and overall base composition that differ from one another solely in their base sequence. Assume, for example, that one of the DNA fragments constitutes a portion of the β-globin gene and all the other fragments contain unrelated genes. The only way to distinguish between the


fragment encoding the β-globin polypeptide and all the others is to carry out a molecular hybridization experiment using complementary molecules as probes. In the present example, incubation of the mixture of denatured DNA fragments with an excess number of β-globin mRNAs would drive the globin fragments to form double-stranded DNA–RNA hybrids, leaving the other DNA fragments single-stranded. There are a number of ways one could separate the DNA–RNA hybrids from the single-stranded fragments. For example, the mixture could be passed through a column of hydroxylapatite under ionic conditions in which the hybrids would bind to the calcium phosphate salts in the column, while the nonhybridized DNA molecules would pass through unbound. The hybrids could then be released from the column by increasing the concentration of the elution buffer. Experiments using nucleic acid hybridization require the incubation of two populations of complementary single-stranded nucleic acids under conditions (ionic strength, temperature, etc.) that promote formation of double-stranded molecules. Depending on the type of experiment being conducted, the two populations of reacting molecules may both be present in solution, or one of the populations may be immobilized, for example, by localization within a chromosome (as in Figure 10.19). In many cases, one of the populations of single-stranded nucleic acids to be employed in a hybridization experiment is

present within a gel. Consider a population of DNA fragments that have been prepared from genomic DNA and fractionated by gel electrophoresis (Figure 18.36). To carry out the hybridization, the gel is treated to render the DNA single-stranded. The single-stranded DNA is transferred from the gel to a nitrocellulose membrane and fixed onto the membrane by heating it to 80°C in a vacuum. The procedure by which DNA is transferred to a membrane is termed blotting. Once the DNA is attached, the membrane is incubated with a labeled, single-stranded DNA (or RNA) probe capable of hybridizing to a complementary group of fragments. Unbound probe is then washed away, and the location of the bound probe is determined autoradiographically as shown in Figure 18.36. The experiment just described and depicted in Figure 18.36 using a radioactive probe is called a Southern blot (named after Edwin Southern, its developer). One or a few DNA restriction fragments that contain a particular nucleotide sequence can be identified in a Southern blot, even if there are thousands of unrelated fragments present in the gel. An example of a Southern blot is shown in Figure 10.18. RNA molecules can also be separated by electrophoresis and identified with a labeled DNA probe after being blotted onto a membrane. An example of this procedure, which is called a Northern blot, is shown in Figure 11.35. DNA probes can be labeled in a variety of ways. A radioactive probe incorporates a radioactive isotope (such as

Figure 18.36 Determining the location of specific DNA fragments in a gel by a Southern blot. As described in the figure, the fractionated DNA fragments are washed out of the gel and trapped onto a nitrocellulose membrane, which is incubated with radioactively (or fluorescently) labeled DNA (or RNA) probes. The location of the hybridized fragments is determined autoradiographically (or microscopically if the probes are fluorescently labeled). During the blotting procedure, capillary action draws the buffer upward into the paper towels. As the buffer moves through the electrophoretic gel, it dissolves the DNA fragments and transfers them to the surface of the adjacent membrane.

[Figure 18.36 panel labels: Electrophoretic gel containing fractionated DNA fragments; the DNA is made single-stranded (denatured) by alkali treatment. Blotting procedure for transfer of DNA from gel to nitrocellulose membrane (weights, glass plate, stack of paper towels, nitrocellulose membrane, electrophoretic gel, sponge, transfer buffer). Nitrocellulose membrane with adsorbed DNA fragments following heat treatment that fixes the DNA to the membrane. Incubate membrane with labeled DNA or RNA probes to allow hybridization, then wash and prepare autoradiogram. Autoradiogram showing location of DNA fragments complementary to labeled probe.]

³²P) at one or more locations in the molecule. The presence of the probe is detected by autoradiography as shown in Figure 18.36. Probes can also be labeled with fluorophores and detected by fluorescence. Another commonly used label is biotin, a small organic molecule that can be covalently linked to the DNA backbone. Biotin is detected by the protein avidin (or streptavidin), which binds tightly to it. The avidin itself must be labeled for detection, such as with a fluorophore as shown in Figures 10.19 and 10.20. Nucleic acid hybridization can also provide a measure of the similarity in nucleotide sequence between two samples of DNA, as might be obtained, for example, from two different organisms. The more distant the evolutionary relationship between the two species, the greater the divergence of their DNA sequences. If purified DNA from species A and B is mixed together, denatured, and allowed to reanneal, a percentage of the DNA duplexes are formed by DNA strands from the two species. Because they contain mismatched bases, such duplexes are less stable than those formed by DNA strands of the same species, and this instability is reflected in the lower temperature at which they melt. When DNAs from different species are allowed to reanneal in different combinations, the melting temperature (Tm, page 400) of the hybrid duplexes provides a measure of the evolutionary distance between the organisms. Two other important types of nucleic acid hybridization protocols are discussed in detail in the text: in situ hybridization on page 402 and hybridization to cDNA microarrays on page 515.
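The logic of picking out one fragment from a mixture with a complementary probe, as in the Southern blot described in this section, can be sketched in a few lines. The example below is a deliberate simplification: it treats hybridization as an exact match between the probe and a stretch of a single-stranded fragment, whereas real hybridization tolerates some mismatches depending on the conditions. All sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def hybridizing_fragments(fragments, probe):
    """Indices of single-stranded fragments (5' to 3') that contain a
    stretch perfectly complementary to the probe."""
    target = reverse_complement(probe)
    return [i for i, frag in enumerate(fragments) if target in frag]

# Hypothetical denatured fragments, as they might sit on a membrane.
FRAGMENTS = [
    "ATGGTGCACCTGACT",
    "TTTTGGGGCCCCAAAA",
    "GATTACAGATTACA",
]
# Hypothetical probe, complementary to part of the first fragment.
PROBE = "CAGGTGCAC"

print(hybridizing_fragments(FRAGMENTS, PROBE))
```

Only the fragment containing the complementary stretch is reported, which mirrors how a labeled probe marks one band among many unrelated fragments on the membrane.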

The chemical reactions that link nucleotides have been automated, and oligonucleotide synthesis is now carried out by computer-controlled machines hooked to reservoirs of reagents. The operator enters the desired nucleotide sequence into the computer and keeps the instrument supplied with materials. The oligonucleotide is assembled one nucleotide at a time from the 3′ to the 5′ end of the molecule, up to a total of about 100 nucleotides. Modifications such as biotin and fluorophores can be incorporated into the molecules. If a double-stranded molecule is needed, it is synthesized as two complementary single strands that can be hybridized together. Longer synthetic molecules are made in segments which are joined together as illustrated by the experiment discussed on page 19.
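One way to see why stepwise synthesis is limited to roughly 100 nucleotides is to consider how small losses at each addition accumulate. The sketch below assumes that every coupling step succeeds with the same probability and that a chain of n nucleotides requires n − 1 couplings. The efficiency values are illustrative assumptions, since the text gives no figure.

```python
def full_length_yield(length_nt, coupling_efficiency):
    """Fraction of chains that reach full length, assuming each of the
    (length_nt - 1) coupling steps succeeds independently."""
    if length_nt < 1:
        raise ValueError("length must be at least 1 nucleotide")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return coupling_efficiency ** (length_nt - 1)

# Illustrative (assumed) per-step efficiencies and chain lengths.
for efficiency in (0.98, 0.99, 0.995):
    for length in (20, 100, 200):
        fraction = full_length_yield(length, efficiency)
        print(f"efficiency {efficiency:.3f}, {length:>3} nt: "
              f"{fraction:.1%} full length")
```

Under these assumptions the full-length fraction falls off geometrically with length, which is consistent with the practice described above of making longer molecules in segments that are joined afterward.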

18.12 | Recombinant DNA Technology

Over the past 30 years, tremendous advances have been made in the analysis of eukaryotic genomes. This progress began as molecular biologists learned to construct recombinant DNA molecules, which are molecules containing DNA sequences derived from more than one source. Recombinant DNAs can be used in myriad ways. We will begin by considering one of the most important applications: the isolation from the genome of a particular segment of DNA that encodes a particular polypeptide. First, however, it is necessary to consider a class of enzymes whose discovery and use has made the formation of recombinant DNA molecules possible.

Restriction Endonucleases


18.11 | Chemical Synthesis of DNA

Hybridization analysis requires single-stranded nucleic acid molecules for use as probes. Other fundamental techniques for the manipulation and analysis of DNA in the laboratory also require short single-stranded nucleic acid molecules, or oligonucleotides. Chemical synthesis of DNA and RNA is therefore a key supporting technology for many procedures. The development of chemical techniques to synthesize polynucleotides having a specific base sequence was begun by H. Gobind Khorana in the early 1960s as part of an attempt to decipher the genetic code. Khorana and co-workers continued to refine their techniques, and a decade after their initial work on the code, they succeeded in synthesizing a complete bacterial tyrosine tRNA gene, including the nontranscribed promoter region. The gene totaling 126 base pairs was put together from over 20 segments, each of which was individually synthesized and later joined enzymatically. This artificial gene was then introduced into bacterial cells carrying mutations for this tRNA, and the synthetic DNA was able to replace the previously deficient function. The first chemically synthesized gene encoding an average-sized protein, human interferon, was prepared in 1981, an effort that required the synthesis and assembly of 67 different fragments to produce a single duplex of 514 base pairs containing initiation and termination signals recognized by the bacterial RNA polymerase.

During the 1970s, it was found that bacteria contained nucleases that would recognize short nucleotide sequences within duplex DNA and cleave the DNA backbone at specific sites on both strands of the duplex. These enzymes are called type II restriction endonucleases, or simply restriction enzymes. They were given this name because they function in bacteria to destroy viral DNAs that might enter the cell, thereby restricting the growth of the viruses. The bacterium protects its own DNA from nucleolytic attack by methylating the bases at susceptible sites, a chemical modification that blocks the action of the enzyme. Enzymes from several hundred different prokaryotic organisms have been isolated that, together, recognize over 100 different nucleotide sequences. The sequences recognized by most of these enzymes are four to six nucleotides long and are characterized by a particular type of internal symmetry. Consider the particular sequence recognized by the enzyme EcoR1:

3′ CTTAAG 5′
5′ GAATTC 3′

This segment of DNA is said to have twofold rotational symmetry because it can be rotated 180° without change in


base sequence. Thus, if one reads the sequence in the same direction (3′ to 5′ or 5′ to 3′) on either strand, the same order of bases is observed. A sequence with this type of symmetry is called a palindrome. When the enzyme EcoR1 attacks this palindrome, it breaks each strand at the same site in the sequence, which is indicated by the arrows between the A and G residues. The red dots indicate the methylated bases in this sequence that protect the host DNA from enzymatic attack. Some restriction enzymes cleave bonds directly opposite one another on the two strands producing blunt ends, whereas others, such as EcoR1, make staggered cuts.
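The palindrome property and the staggered cut can both be checked computationally. The sketch below tests whether a recognition site reads the same on both strands and then cuts the top strand of a linear sequence between the G and the adjacent A of each GAATTC site. The example sequence and the function names are hypothetical, and only one strand is modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic_site(site):
    """True if the site reads identically 5' to 3' on both strands."""
    return site == reverse_complement(site)

def digest(seq, site="GAATTC", cut_offset=1):
    """Cut the top strand of a linear sequence at every occurrence of
    the site, `cut_offset` bases in from the start of the site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

# Hypothetical linear sequence containing two recognition sites.
EXAMPLE = "AAGAATTCCCGAATTCTT"
print(is_palindromic_site("GAATTC"))
print(digest(EXAMPLE))
```

Each fragment downstream of a cut begins with AATT on the top strand. Because the cut on the complementary strand falls at the symmetrical position, these four bases are left as the short single-stranded tail produced by a staggered cut.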

The discovery and purification of restriction enzymes have been invaluable in the advances made by molecular biologists in recent years. Since a particular sequence of four to six nucleotides occurs quite frequently simply by chance, any type of DNA is susceptible to fragmentation by these enzymes. The use of restriction enzymes allows the DNA of the human genome, or that of any other organism, to be dissected into a precisely defined set of specific fragments. Once the DNA from a particular individual is digested with one of these enzymes, the fragments generated can be fractionated on the basis of length by gel electrophoresis (as in Figure 18.37a). Different enzymes cleave the same preparation of DNA into different

[Figure 18.37 labels: (a) three gel lanes with ORIGIN marked; bands of the complete digest labeled HpaII-1 through HpaII-8; bands of the partial digests labeled A through K, with HpaII-1 through HpaII-4 also marked; BLUE. (b) Map scale marked at 25, 50, and 75; EcoR1 site; fragments HpaII-1 through HpaII-8 along the top; overlapping partial-digest fragments A through M below.]

Figure 18.37 The construction of a restriction map of the small circular genome of the DNA tumor virus polyoma. (a) Autoradiographs of ³²P-labeled DNA fragments that have been subjected to gel electrophoresis. The gel on the left shows the pattern of DNA fragments obtained after a complete digestion of the polyoma genome with the enzyme HpaII. To determine how these eight fragments are pieced together to make up the intact genome, it is necessary to treat the DNA in such a way that overlapping fragments are generated. Overlapping fragments can be produced by treating the intact genome with a second enzyme that cleaves the molecule at different sites, or by treating the genome with the same enzyme under conditions where the DNA is not fully digested as it was in the left gel. The two gels on the right represent examples of partial digests of the polyoma genome with HpaII. The middle gel shows the fragments generated by partial digestion of the superhelical circular DNA, and the gel on the right shows the HpaII fragments formed after the circular genome is converted into a linear molecule by EcoR1 (an enzyme that makes only one cut in the circle). (b) The restriction map of the linearized polyoma genome based on cleavage by HpaII. The eight fragments from the complete digest are shown along the DNA at the top. The overlapping fragments from the partial digest are shown in their ordered arrangement below the map. (Fragments L and M migrate to the bottom of the gel in part a, right side.) (FROM BEVERLY E. GRIFFIN, MIKE FRIED, AND ALLISON COWIE, PROC. NAT'L. ACAD. SCI. U.S.A. 71:2078, 1974.)


sets of fragments, and the sites within the genome that are cleaved by various enzymes can be identified and ordered into a restriction map such as that depicted in Figure 18.37b.
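The remark that a sequence of four to six nucleotides occurs frequently by chance can be quantified under a simple model. The sketch below assumes a random sequence in which the four bases are equally frequent and independent, so that a particular site of length n is expected about once every 4ⁿ base pairs. Real genomes depart from this model, and the genome length used is a hypothetical round number.

```python
def expected_spacing_bp(site_length):
    """Average distance between occurrences of one specific site in a
    random sequence with four equally frequent bases."""
    if site_length < 1:
        raise ValueError("site length must be at least 1")
    return 4 ** site_length

def expected_site_count(genome_length_bp, site_length):
    """Expected number of occurrences of the site in the genome."""
    return genome_length_bp / expected_spacing_bp(site_length)

# Hypothetical genome of one million base pairs.
GENOME_LENGTH = 1_000_000
for n in (4, 6):
    print(f"{n}-bp site: about one per {expected_spacing_bp(n)} bp, "
          f"roughly {expected_site_count(GENOME_LENGTH, n):.0f} sites")
```

Under this model a four-base site recurs about every 256 base pairs and a six-base site about every 4096, so an enzyme with a shorter recognition sequence is expected to produce more, and smaller, fragments.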

Formation of Recombinant DNAs

Recombinant DNAs can be formed in a variety of ways. In the method shown in Figure 18.38, DNA molecules from two different sources are treated with a restriction enzyme that makes staggered cuts in the DNA duplex. Staggered cuts leave short, single-stranded tails that act as “sticky ends” that can bind to a complementary single-stranded tail on another DNA molecule to restore a double-stranded molecule. In the example depicted in Figure 18.38, one of the DNA fragments

[Figure 18.38 labels: human chromosome; bacterial plasmid; same restriction enzyme splits both types of DNA.]

that will make up the recombinant molecule is a bacterial plasmid. Plasmids are small, circular, double-stranded DNA molecules that are separate from the main bacterial chromosome. The other DNA fragment in Figure 18.38 is obtained from human cells following treatment with the same restriction enzyme used to open the plasmid. When the human DNA fragments and the treated plasmid are incubated together in the presence of DNA ligase, the two types of DNAs become hydrogen bonded to one another by their sticky ends and are then ligated to form circular DNA recombinants, as in Figure 18.38. The first recombinant DNA molecules were formed by this basic method in 1972 by Paul Berg, Herbert Boyer, Annie Chang, and Stanley Cohen of Stanford University and the University of California, San Francisco, marking the birth of modern genetic engineering. By following the procedure just described, a large number of different recombinant molecules are produced, each of which contains a bacterial plasmid with a segment of human DNA incorporated into its circular structure (see Figure 18.39). Suppose you were interested in isolating a single gene from the human genome, for example, the gene that codes for insulin. Because your goal is to obtain a purified preparation of the one type of recombinant DNA that contains the insulin-coding fragment, you must separate this one fragment from all of the others. This is done by a process called DNA cloning. We will return to the search for the insulin gene after describing the basic methodology behind DNA cloning.

DNA Cloning

[Figure 18.38 labels, continued: complementary single-stranded AATT tails; human DNA fragment joins with plasmid by sticky ends; nicks are sealed by enzyme; plasmid.]

Figure 18.38 Formation of a recombinant DNA molecule. In this example, a preparation of bacterial plasmids is treated with a restriction enzyme that makes a single cut within each bacterial plasmid. This same restriction enzyme is used to fragment a preparation of human genomic DNA into small fragments. Because they have been treated with the same restriction enzyme, the cleaved plasmid DNA and the human DNA fragments will have sticky ends. When these two populations are incubated together, the two DNA molecules become noncovalently joined to each other and are then covalently sealed by DNA ligase, forming a recombinant DNA molecule.

DNA cloning is a technique to produce large quantities of a specific DNA segment. The DNA segment to be cloned is first linked to a vector DNA, which is a vehicle for carrying foreign DNA into a suitable host cell, such as the bacterium E. coli. The vector contains sequences that allow it to be replicated within the host cell. Two types of vectors are commonly used to clone DNAs within bacterial hosts. In one approach, the DNA segment to be cloned is introduced into the bacterial cell by joining it to a plasmid, as described previously, and then causing the bacterial cells to take up the plasmid from the medium. In an alternate approach, the DNA segment is joined to a portion of the genome of the bacterial virus lambda (λ), which is then allowed to infect a culture of bacterial cells, producing large numbers of viral progeny, each of which contains the foreign DNA segment. Either way, once the DNA segment is inside a bacterium, it is replicated along with the bacterial (or viral) DNA and partitioned to the daughter cells (or progeny viral particles). Thus, the number of recombinant DNA molecules increases in proportion to the number of bacterial cells (or viral progeny) that are formed. From a single recombinant plasmid or viral genome inside a single bacterial cell, millions of copies of the DNA can be formed within a short period of time. Once the amount of DNA has been sufficiently amplified, the recombinant DNA can be purified and used in other procedures. In addition to providing a means to amplify the amount of a particular DNA sequence, cloning can also be used as a technique to isolate a pure form of any

767

specific DNA fragment among a large, heterogeneous population of DNA molecules. We will begin by discussing DNA cloning using bacterial plasmids.

[Figure 18.39 labels: E. coli chromosome; plasmid; antibiotic resistance gene. Steps: purify human DNA; purify plasmid DNA; treat with EcoR1 to cleave both human and bacterial DNA into fragments; join fragments into recombinant DNAs with DNA ligase, giving a population of plasmids containing different segments of human DNA; incubate E. coli cells under conditions in which they will take up plasmids from the medium; grow cells in medium that selects for those containing a recombinant plasmid. Cells shown: plasmid-free E. coli; colonies carrying the insulin gene, a ribosomal RNA gene, or inserts of unknown function (?).]

Figure 18.39 An example of DNA cloning using bacterial plasmids. DNA is extracted from human cells, fragmented with EcoR1, and the fragments are inserted into a population of bacterial plasmids. Techniques are available to prevent the formation of plasmids that lack a foreign DNA insert. Once a bacterial cell has picked up a recombinant plasmid from the medium, the cell gives rise to a colony of cells containing the recombinant DNA molecule. In this example, most of the bacteria contain eukaryotic DNAs of unknown function (labeled as ?), while one contains a portion of the DNA encoding ribosomal RNA, and another contains DNA encoding insulin.


Cloning Eukaryotic DNAs in Bacterial Plasmids

The foreign DNA to be cloned is first inserted into the plasmid to form a recombinant DNA molecule. Plasmids used for DNA cloning are modified versions of those found in bacterial cells. Like the natural counterparts from which they are derived, these plasmids contain an origin of replication and one or more genes that make the recipient cell resistant to one or more antibiotics. Antibiotic resistance allows investigators to select for those cells that contain the recombinant plasmid.

As first demonstrated by Avery, MacLeod, and McCarty (page 422), bacterial cells can take up DNA from their medium. This phenomenon forms the basis for cloning plasmids in bacterial cells (Figure 18.39). In the most commonly employed technique, recombinant plasmids are added to a bacterial culture that has been pretreated with calcium ions. When subjected to a brief heat shock, such bacteria are stimulated to take up DNA from their surrounding medium. Usually only a small percentage of cells are competent to pick up and retain one of the recombinant plasmid molecules. Once it is taken up, the plasmid replicates autonomously within the recipient and is passed on to the progeny during cell division. Those bacteria containing a recombinant plasmid can be selected from the others by growing the cells in the presence of the antibiotic to which resistance is conferred by one or more genes on the plasmid.

We began this discussion with the goal of isolating a small DNA fragment that contains the sequence for the insulin gene. To this point, we have formed a population of bacteria containing many different recombinant plasmids, very few of which contain the gene being sought (as in Figure 18.39). One of the great benefits of DNA cloning is that, in addition to producing large quantities of particular DNAs, it allows one to separate different DNAs from a mixture.
It was noted above that plasmid-containing bacteria can be selected from those without plasmids by treatment with antibiotics. Once this has been done, the plasmid-bearing cells can be grown at low density on petri dishes so that all of the progeny of each cell (a clone of cells) remain physically separate from the progeny of other cells. Because a large number of different recombinant plasmids were present initially in the medium, the different cells plated on the dish contain different foreign DNA fragments. Once the cells containing the various plasmids have grown into separate colonies, the investigator can search through the colonies for the few that contain the gene being sought, in this case the insulin gene.

Culture dishes that contain bacterial colonies (or phage plaques) are screened for the presence of a particular DNA sequence using the combined techniques of replica plating and in situ hybridization. As illustrated in Figure 18.40a, replica plating allows one to prepare numerous dishes containing representatives of the same bacterial colonies in precisely the same position on each dish. One of the replica plates is then used to localize the DNA sequence being sought (Figure 18.40b), a procedure that requires that the cells be lysed and the DNA fixed onto the surface of a nylon or nitrocellulose membrane. Once the DNA is fixed in place, it is denatured to prepare it for in situ hybridization, during which the membrane is incubated with a labeled, single-stranded DNA probe containing the complementary sequence to that being sought. Following incubation, the unhybridized probe is washed from the membrane, and the location of the labeled hybrids is determined. Live representatives of the identified clones can be found at corresponding sites on the original plates, and these cells can be grown into large colonies, which serves to amplify the recombinant DNA plasmid. After the desired amount of amplification, the cells are harvested, the DNA is extracted, and the recombinant plasmid DNA is readily separated from the much larger chromosome by various techniques, including equilibrium centrifugation (Figure 18.41). The isolated recombinant plasmids can then be treated with the same restriction enzyme used in their formation, which releases the cloned DNA segments from the remainder of the DNA that served as the vector. The cloned DNA can then be separated from the plasmid by centrifugation.

Figure 18.40 Locating a bacterial colony containing a desired DNA sequence by replica plating and in situ hybridization. (a) Once the bacterial cells that were plated on the culture dish have grown into colonies, the dish is inverted over a piece of filter paper, allowing some of the cells from each colony to become adsorbed to the paper. Empty culture dishes are then inoculated by pressing them against the filter paper to produce replica plates. (b) Procedure for screening the cells of a culture dish for those colonies that contain the recombinant DNA of interest. Cultures can be screened using either radioactively or fluorescently labeled probes. Once the relevant colonies are identified, cells can be removed from the original dish and grown separately to yield large quantities of the DNA fragments being sought.

[Figure 18.40 labels: (a) sterile filter paper or "velvet". (b) Culture dish with bacterial colonies (or phage plaques) containing recombinant DNA; transfer representative cells to nitrocellulose membrane by replica plating technique; lyse cells and treat to denature DNA and cause single-stranded DNA to adhere to the filter in place; incubate with radioactively or fluorescently labeled probe; position of labeled hybrid.]

Figure 18.41 Separation of plasmid DNA from that of the main bacterial chromosome by CsCl equilibrium centrifugation. This centrifugation tube can be seen to contain two bands, one of plasmid DNA carrying a foreign DNA segment that has been cloned within the bacteria, and the other containing chromosomal DNA from these same bacteria. The two types of DNA have been separated during centrifugation (Figure 18.35b). The researcher is removing the DNA from the tube with a needle and syringe. The DNA in the tube is made visible using the DNA-binding compound ethidium bromide, which fluoresces under ultraviolet light. (TED SPEIGEL/© CORBIS.)

Cloning Eukaryotic DNAs in Phage Genomes

Another major cloning vector is the bacteriophage lambda, which is depicted in Figure 18.42. The lambda genome is a linear, double-stranded DNA molecule approximately 50 kb in length. The modified strain used in most cloning experiments contains two cleavage sites for the enzyme EcoR1, which fragments the genome into three large segments. Conveniently, all the information essential for infection and cell lysis is contained in the two outer segments, so that the dispensable middle fragment can be replaced by a piece of eukaryotic DNA up to approximately 25 kb. Recombinant DNA molecules can be packaged into phage heads in vitro, and these genetically engineered phage particles can be used to infect host bacteria. (Phage DNA molecules lacking the insert are too short to be packaged.) Once in a bacterium, the eukaryotic DNA segment is amplified along with the viral DNA and then packaged into a new generation of virus particles, which are released when the cell is lysed. The released particles infect new cells, and soon a clear spot (or plaque) in the bacterial “lawn” is visible at the site of infection. Each plaque contains millions of phage particles, each carrying a single copy of the same eukaryotic DNA fragment.

Lambda phage is appealing as a cloning vector for a number of reasons: (1) the DNA is nicely packaged in a form in which it can be readily stored and easily extracted; (2) virtually every phage containing a recombinant DNA is capable of infecting a bacterial cell; and (3) a single petri dish can accommodate more than 100,000 different plaques. Once the phage plaques have been formed, the particular DNA fragment being sought is identified by a process of replica plating and in situ hybridization similar to that described in Figure 18.40 for the cloning of recombinant plasmids. The cloning of larger DNA fragments in either bacteria (as BACs) or yeast (as YACs) is discussed on page 774.

Figure 18.42 Protocol for cloning eukaryotic DNA fragments in lambda phage. The steps are described in the text.

[Figure 18.42 labels: mutant whose DNA contains two EcoR1 sites; extract DNA, treat with EcoR1 (fragments 1, 2, 3); separate fragments, discard middle section; splice fragments 1 and 3 with eukaryotic fragment to form recombinant DNA; packaging of recombinant DNA into phage head; infect host bacteria; clear phage plaque in bacterial lawn on culture dish.]

18.13 | Enzymatic Amplification of DNA by PCR

In 1983, a new technique was conceived by Kary Mullis of Cetus Corporation that has become widely used to amplify specific DNA fragments without the need for bacterial cells. This technique is known as the polymerase chain reaction (PCR). There are many different PCR protocols used for a multitude of different applications in which anywhere from one to a large population of related DNAs can be amplified. PCR amplification is readily adapted to RNA templates by first converting them to complementary DNAs using reverse transcriptase.

Figure 18.43 Polymerase chain reaction (PCR). As discussed in the text, the procedure takes advantage of a heat-resistant DNA polymerase whose activity is not destroyed when the temperature is raised to separate the two strands of the double helix. With each cycle of duplication, the strands are separated, flanking segments (primers) bind to the ends of the selected region, and the polymerase copies the intervening segment.

[Figure 18.43 labels. First cycle: the reaction mixture contains the target DNA sequence to be amplified (1 copy), two primers (P1, P2), and heat-resistant polymerase; the mixture is heated to separate the strands of the target DNA, and subsequent cooling allows the primers to hybridize to complementary sequences in the target DNA; the polymerase extends complementary strands from the primers, giving 2 identical copies. Second cycle: separate DNA strands, hybridize primers, extend new DNA strands, giving 4 identical copies; later cycles give 8, 16, etc.]

The basic procedure used in PCR is depicted in Figure 18.43. The technique employs a heat-stable DNA polymerase, called Taq polymerase, that was originally isolated from Thermus aquaticus, a bacterium that lives in hot springs at temperatures above 90°C. In the simplest protocol, a sample of DNA is mixed with an aliquot of Taq polymerase and all four deoxyribonucleotides, along with a large excess of two short, synthetic DNA fragments (oligonucleotides) that are complementary to DNA sequences at the 3′ ends of the region of the DNA to be amplified. The oligonucleotides serve as primers (page 550) to which nucleotides are added during the following replication steps. The mixture is then heated to about 95°C, which is hot enough to cause the DNA molecules in the sample to separate into their two component strands. The mixture is then cooled to about 60°C to allow the primers to hybridize to the strands of the target DNA, and the temperature is raised to about 72°C to allow the thermophilic polymerase to add nucleotides to the 3′ end of the primers. As the polymerase extends the primer, it selectively copies the target DNA, forming new complementary DNA strands. The temperature is raised once again, causing the newly formed and the original DNA strands to separate from each other. The sample is then cooled to allow the synthetic primers in the mixture to bind once again to the target DNA, which is now present at twice the original amount.

This cycle is repeated over and over again, each time doubling the amount of the specific region of DNA that is flanked by the bound primers. Billions of copies of this one specific region can be generated in just a few hours using a thermal cycler that automatically changes the temperature of the reaction mixture to allow each step in the cycle to take place.
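The arithmetic behind "billions of copies" follows directly from the doubling described above. The sketch below assumes the idealized case of the text, in which every cycle doubles the target. The `efficiency` parameter is an added assumption (real reactions fall short of perfect doubling), and the function names are illustrative only.

```python
def copies_after(cycles, start_copies=1, efficiency=1.0):
    """Copies of the target region after a number of PCR cycles.

    efficiency=1.0 is perfect doubling; 0.9 would mean each cycle
    multiplies the amount of target by 1.9 instead of 2.
    """
    return start_copies * (1.0 + efficiency) ** cycles


def cycles_needed(target_copies, start_copies=1, efficiency=1.0):
    """Smallest number of cycles that reaches at least target_copies."""
    cycles, copies = 0, float(start_copies)
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


for n in (1, 2, 3, 4, 30):
    print(n, int(copies_after(n)))

# Cycles required to pass one billion copies from a single template.
print(cycles_needed(1e9))
```

Under perfect doubling, a single template passes one billion copies at cycle 30, which is why a run of a few dozen cycles is enough for most purposes.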

Applications of PCR

Amplifying DNA for Cloning or Analysis. Since its invention, PCR has found many uses. It can generate many copies of a specific DNA fragment prior to cloning the fragment, an efficient approach if the target sequence is known in sufficient detail that the nucleotide sequence of two complementary primers can be specified. This is particularly helpful in cases where the source DNA is very scarce, since PCR can generate large amounts of DNA from minuscule samples, such as that in a single cell. PCR has been used in criminal investigations to generate quantities of DNA from a spot of dried blood left on a crime suspect’s clothing or even from the DNA present in part of a single hair follicle left at the scene of a crime. For this purpose, one selects regions of the genome for amplification that are highly polymorphic (i.e., vary at high frequency within the population), so that no two individuals will have the same-sized DNA fragments (as in Figure 10.18). This same procedure can be used to study DNA fragments from well-preserved fossil remains that may be millions of years old. The activity of DNA polymerase in PCR is also employed in DNA sequencing (see Section 18.14).

Testing for the Presence of Specific DNA Sequences. Suppose you wanted to determine whether or not a tissue sample contains a particular virus. You could answer this question by a Southern hybridization (page 763) or you could use PCR. In the PCR approach, nucleic acid is isolated from the sample and PCR primers complementary to the viral DNA are added, along with the other PCR reagents. The reaction is then allowed to proceed. If the virus genome is present in the sample, the PCR primers will hybridize to it and the PCR reaction will generate a product. If the virus is not present, the PCR primers will not hybridize and no product will be generated. Thus, in this application, the PCR reaction itself serves as the detection system.

Comparing DNA Molecules. If two DNA molecules have the same base sequence, they will yield the same PCR products in reactions with identical primers. This is the premise for quick assays that compare the similarity of two DNA samples such as genomic DNA from bacterial isolates. PCR is performed on the samples using several primers, which can be specifically designed or randomly generated. The products are separated by gel electrophoresis and compared. The more similar the sequences of the bacterial genomes, the more similar their PCR products will be.
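The detection and comparison assays above both rest on one rule: a product forms only when both primers find their binding sites. The following sketch makes that rule concrete. It is a simplification that assumes exact, full-length primer matches on a single-stranded representation of the sample, whereas real primers tolerate some mismatch. The sequences, primers, and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pcr_product(template, forward_primer, reverse_primer):
    """Predict the amplified region, or return None if a primer cannot bind.

    template is the top strand (5'->3'). The forward primer matches the top
    strand directly; the reverse primer binds the top strand, so its reverse
    complement must appear downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    downstream_site = reverse_complement(reverse_primer)
    end = template.find(downstream_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(downstream_site)]


# Hypothetical samples: one carries the target sequence, one does not.
infected_sample = "TTGACCGGATCCATGAAAGCTTGGCCTAAGT"
clean_sample = "TTGACCGCATGCATGAAAGCTTCGCGTAAGT"
forward, reverse = "GGATCC", "TAGGCC"

print(pcr_product(infected_sample, forward, reverse))
print(pcr_product(clean_sample, forward, reverse))
```

Comparing two isolates then amounts to running the same primer pairs against both and comparing the lengths of the predicted products, which is the information a gel would show.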

Quantifying DNA or RNA Templates. PCR can also be used to determine how much of a specific nucleotide sequence (DNA or RNA) is present in a mixed sample. One approach to this quantitative PCR uses the binding of a dye specific for double-stranded DNA to quantify the amount of double-stranded product being generated. The rate of accumulation of product is proportional to the amount of template present in the sample. Another approach uses what have been called “molecular beacons.” These are short reporter oligonucleotides with a fluorophore bound to one end and a quencher molecule on the other end that hybridize in the middle of the target sequence to be amplified. As long as the short oligonucleotide is intact, the fluorophore and quencher are close enough in proximity that fluorescence is quenched. When the DNA polymerase synthesizes a new strand of DNA complementary to the template, its exonuclease activity (page 557) degrades the reporter oligonucleotide. The fluorophore is thus separated from the quencher and fluoresces. The amount of fluorophore liberated in a given PCR cycle is directly proportional to the number of template molecules being copied by the polymerase.

18.14 | DNA Sequencing

By 1970, the amino acid sequence of a long list of proteins had been determined, yet virtually no progress had been made toward the sequencing of nucleotides in DNA. There were several reasons for this state of affairs. Unlike DNA molecules, polypeptides come in defined and manageable lengths; a given polypeptide species could be readily purified; a variety of techniques were available to cleave the polypeptide at various sites to produce overlapping fragments; and the presence of 20 different amino acids having widely varying properties made separation and sequencing of small peptides a straightforward task. Then in the mid-1970s, a revolution in DNA sequence technology took place. By 1977, the complete nucleotide sequence of an entire viral genome was reported, that of ΦX174, some 5375 nucleotides in length. This milestone in molecular biology was accomplished in the laboratory of Frederick Sanger, who had determined the first amino acid sequence of a polypeptide (insulin) some 25 years earlier. By 2001, the rough draft of the human genome sequence, equivalent to roughly 3 billion base pairs, was published, the product of years of work by hundreds of scientists.

These advances in DNA sequencing were made possible by developments in several areas: molecular approaches to DNA sequencing, instrumentation that could be automated, more powerful and widely available computers, and software for data analysis. The initial key was the development of a feasible approach for determining the sequence of large DNA fragments. This advance itself became possible because of the discovery of restriction enzymes and the development of cloning technologies, which provided the means necessary to prepare a defined DNA fragment in sufficient quantity to carry out the necessary biochemical procedures. By 1980, a DNA sequencing methodology developed by Sanger and A. R. Coulson of the Medical Research Council in Cambridge, England, had gained widespread acceptance as the method of choice.

Following the development of PCR, the Sanger-Coulson sequencing approach and PCR were merged into a sequencing procedure that combines the biochemistry of Sanger-Coulson with the repetitive cycles and automation of PCR. This so-called cycle sequencing became widely used in genome sequencing and its basic steps are outlined in Figure 18.44. Although the Sanger-Coulson approach is no longer used in major DNA sequencing projects, its historic importance as the technique employed in all of the earlier genome-sequencing studies, including that of the Human Genome Project, as well as the elegance of its biochemical methodology, makes a brief description a valuable learning tool in molecular biology.

Figure 18.44 DNA sequencing. (a) The basic steps in sequencing a small hypothetical fragment by the Sanger-Coulson (dideoxy) technique, as described in the text. (b) Gel lanes in which fluorescently labeled daughter molecules have been separated. The color of the band is determined by the identity of the dideoxynucleotide at the 3′ end of the DNA strand. (c) The sequence of nucleotides in the template strand is interpreted by a computer that “reads” from the bottom to the top of the gel, using the intensity and wavelength of the fluorescent light as input. The computer generates an “electropherogram” showing the intensity and color of the detected fluorescence, along with the DNA sequence interpretation. (B,C: FROM LEROY HOOD AND DAVID GALAS, NATURE 421:445, 2003, FIG. 1C AND 1D. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

[Figure 18.44a labels: double-stranded fragment 5′ GGACATGAGCT 3′ paired with 3′ CCTGTACTCGA 5′; step 1, denature DNA; template strand 3′ CCTGTACTCGA 5′ with GGA oligonucleotide primer + DNA polymerase + dATP, dGTP, dCTP, dTTP + four ddNTPs (ddGTP, ddATP, ddCTP, ddTTP), each labeled with a different fluorescent dye; chain-terminated products GGAC, GGACA, GGACAT, GGACATG, GGACATGA, GGACATGAG, GGACATGAGC, GGACATGAGCT; steps numbered 1 to 5.]

In this procedure, one begins with a population of identical template molecules, either a PCR product or a cloned DNA fragment. The template DNA is mixed with a primer that is complementary to the 3′ end of one strand of the region to be sequenced. The reaction mixture also contains the heat-stable Taq DNA polymerase, all four deoxyribonucleoside triphosphate precursors (dNTPs), and a low concentration of modified precursors called dideoxyribonucleoside triphosphates, or ddNTPs. Each ddNTP (ddATP, ddGTP, ddCTP, and ddTTP) has been modified by the addition of a different-colored fluorescent dye to its 3′ end. The sequencing reaction begins, like PCR, by heating the mixture to a temperature that causes the two template strands to denature (Figure 18.44a, step 1). Next, the reaction is cooled so that the primer can hybridize to the template DNA (step 2). Note that, in contrast to PCR, only one primer is present, so that only one of the two strands of template DNA can hybridize to a primer. In step 3, the Taq polymerase adds dNTPs to the end of the primer that are complementary to the template molecule, synthesizing a new complementary strand of DNA. Every now and then, the polymerase inserts a ddNTP instead of a dNTP. Dideoxynucleotides lack a hydroxyl group at both their 2′ and 3′ positions. When one of these nucleotides has been incorporated onto the end of a growing chain, the lack of the 3′ OH makes it impossible for the polymerase to add another nucleotide, thus causing chain termination (Figure 18.44, step 4). Because the ddNTP is present at much lower concentration in the reaction mixture than the corresponding dNTP, the incorporation of the ddNTP is infrequent and random; it may be incorporated near the beginning of one chain, near the middle of another, or not until the end of a third chain. Regardless, when the ddNTP is incorporated, growth of the chain ceases at that point.
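Random chain termination is easier to picture with a small simulation. The sketch below uses the hypothetical fragment of Figure 18.44a (template 3′ CCTGTACTCGA 5′, primer GGA). The 10 percent chance of ddNTP incorporation at each position, the number of chains, and the function names are assumptions chosen for illustration. Ordering the terminated chains by length stands in for separation by size.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5, primer, dd_fraction=0.1, n_chains=5000, seed=0):
    """Simulate dideoxy chain termination on many copies of one template.

    template_3to5 is the template strand read 3'->5', so the growing chain
    is simply its complement read left to right. Returns the set of distinct
    terminated chains (each written 5'->3', primer included).
    """
    paired = "".join(COMPLEMENT[b] for b in template_3to5[:len(primer)])
    if paired != primer:
        raise ValueError("primer is not complementary to the 3' end of the template")
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_chains):
        chain = primer
        for base in template_3to5[len(primer):]:
            chain += COMPLEMENT[base]
            if rng.random() < dd_fraction:  # a ddNTP was added: the chain stops here
                fragments.add(chain)
                break
    return fragments


def read_sequence(fragments):
    """Order chains by length and read the terminal (dye-labeled) base of each."""
    return "".join(chain[-1] for chain in sorted(fragments, key=len))


fragments = sanger_fragments("CCTGTACTCGA", "GGA")
for chain in sorted(fragments, key=len):
    print(chain)
print("bases read after the primer:", read_sequence(fragments))
```

Because termination is rare at any one position but many chains are made, every possible length ends up represented, and the terminal base of each successive length spells out the new strand one nucleotide at a time.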


After the chain extension phase of the reaction is complete, the temperature is raised again to denature the new double-stranded DNA molecules. The cycle of hybridization, synthesis, and denaturation is repeated many times, generating a large population of daughter DNA strands that by now includes molecules in which a ddNTP has been incorporated at every position. For every A on the template strand, for example, there will be daughter molecules that terminate in a ddTTP at that position. When all cycles are complete, the reaction products are separated by electrophoresis on very thin capillary gels (Figure 18.44, step 5). High-resolution gel electrophoresis can separate fragments that differ by only one nucleotide in length, so that each successive band in the gel contains molecules that are one nucleotide longer than those in the previous band. Since each ddNTP was labeled with a unique fluorescent dye, the color of the band (as read by an automated laser detector) reveals the identity of the terminal nucleotide on each daughter molecule (Figure 18.44b). The order of the colors in the gel therefore corresponds to the base sequence of the template molecule (Figure 18.44c).

Beginning about 2005, a revolution in DNA sequencing technology has taken place, driven by the goal of sequencing genomes, both human and others, rapidly and inexpensively. Costs for sequencing massive amounts of DNA have decreased dramatically over the past few years, and by 2012 it became possible to sequence the genome of a given person for about $5,000, which is orders of magnitude cheaper than just a few years earlier. This progress has been made possible by the development of entirely new strategies for DNA sequencing.

The first of these new strategies is referred to as massively parallel sequencing (or second-generation sequencing) and has dominated genomic sequencing efforts for the past several years. Unlike the sequencing steps of the original Human Genome Project, massively parallel sequencing technologies do not require that pieces of the genome be cloned in yeast or bacteria. Instead, fragments are prepared directly from the whole genome and each is subsequently amplified to form a huge population. Like the Sanger-Coulson approach, massively parallel sequencing techniques are based on polymerase-dependent DNA synthesis, but they do not utilize premature strand termination nor do they require separation of strands by electrophoresis, which limits the number of samples that can be simultaneously monitored. Instead, they accomplish the direct identification of individual nucleotides as they are being incorporated by the polymerases in real time. A number of different instruments are available, each capable of carrying out a variation of this type of rapid automated DNA sequencing. In all of these cases, huge numbers (up to ~10⁹) of DNA molecules are immobilized on a surface and then incubated with DNA polymerase in the presence of the four different NTPs. Various strategies are used to identify the nucleotides that are sequentially incorporated into each of the complementary strands as they are synthesized on the massive numbers of “parallel” DNA templates.

The strands that are synthesized in all of these platforms are relatively short (less than 100 bases in length) compared to the earlier sequencers that utilized electrophoretic methods (approximately 800 bases in length). These newer sequencers are also more error-prone than the instruments used in first-generation efforts. However, there are so many copies of each DNA strand being read simultaneously that high accuracy can be achieved by determining the most abundant sequence generated. The short segments being synthesized (called “reads”) do become a problem when researchers are trying to determine the DNA sequence of the genome of a new species, because it can be very difficult to stitch together (assemble) a vast number of short reads into the very long DNA molecules present in a eukaryotic chromosome. Consequently, these DNA sequencing technologies are best suited for studying individual genomes from a species, such as the human species, where researchers already possess a reference genome into which a given DNA fragment can be placed. Despite the challenges, genomes of new species, such as the panda, have been assembled using these instruments. Another problem with the use of massively parallel sequencers is that the data cannot be sorted into a person’s separate maternal and paternal haploid genomes, or their haplotypes (page 419). This may limit their usefulness in applications for genomic medicine.

Over the past few years, a new collection of DNA sequencing instruments, often referred to as third-generation sequencers, has been (and will be) introduced into the market. As a group, these sequencers use very different mechanistic strategies for determining nucleotide sequences, although most of them act on single DNA molecules (which eliminates the need for DNA amplification), generate long reads (e.g., 1000 nucleotides), and operate at lower cost and faster rates than second-generation instruments. We will briefly consider a couple of these sequencing strategies. In one case, sequencing is accomplished by pulling each DNA molecule through a tiny hole, or “nanopore,” and identifying each nucleotide, one at a time, as it passes through the opening. Nucleotide identification is based on differences in ionic properties among the four nucleotides that can be detected as each nucleotide passes one after another through the nanopore. With large numbers of nanopores operating simultaneously, very rapid sequencing becomes possible. Other third-generation sequencers work on a “sequencing-by-synthesis” approach. In one such instrument, a laser-based detection system identifies fluorescently labeled nucleotides as they are incorporated by a single DNA polymerase moving processively along a DNA strand. A hundred thousand or more of these reactions are monitored simultaneously on different DNA strands, thereby generating large amounts of data during short incubation periods (“runs”).

Once the nucleotide sequence of a segment of DNA has been determined, various software tools can be employed to analyze it. For example, the amino acid sequence encoded by the DNA can be determined and compared to other known amino acid sequences to provide information about the polypeptide’s possible function. The amino acid sequence also provides clues as to the tertiary structure of the protein, particularly those parts of the polypeptide that may act as membrane-spanning segments of integral membrane proteins. The nucleotide sequence itself can also be compared to other known nucleotide sequences. Such comparisons can be used to assess the evolutionary relatedness or history of the DNA sequences, to identify the DNA fragment just sequenced, or to compare the genomic features of various organisms or individuals.

18.15 | DNA Libraries

DNA cloning is often used to produce DNA libraries, which are collections of cloned DNA fragments. Two basic types of DNA libraries can be created: genomic libraries and cDNA libraries. Genomic libraries are produced from total DNA extracted from nuclei and contain all of the DNA sequences of the species. Once a genomic library of a species is available, researchers can use the collection to isolate specific DNA sequences, such as those containing the human insulin gene. cDNA libraries, on the other hand, are derived from DNA copies of an RNA population. cDNA libraries are typically produced from messenger RNAs present in a particular cell type and thus correspond to the genes that are active in that type of cell.

Genomic Libraries

We will first examine the production of a genomic library. In one approach, genomic DNA is treated with one or two restriction enzymes that recognize very short nucleotide sequences under conditions of low enzyme concentration, such that only a small percentage of susceptible sites are actually cleaved. Two commonly used enzymes that recognize tetranucleotide sequences are HaeIII (recognizes GGCC) and Sau3A (recognizes GATC). A given tetranucleotide would be expected to occur by chance with such a high frequency that any sizable segment of DNA is sensitive to fragmentation. Once the DNA is partially digested, the digest is fractionated by gel electrophoresis or density gradient centrifugation, and those fragments of suitable size (e.g., 20 kb in length) are incorporated into lambda phage particles. These phage are used to generate the million or so plaques needed to ensure that every single segment of a mammalian genome is represented. Because the DNA is treated with enzymes under conditions in which most susceptible sites are not being cleaved, for all practical purposes, the DNA is randomly fragmented.

Once the phage recombinants are produced, they can be stored for later use and, as such, constitute a permanent collection of all the DNA sequences present in the genome of the species. Whenever an investigator wants to isolate a particular sequence from the library, the phage particles can be grown in bacteria and the various plaques (each originating from the infection of a single recombinant phage) screened for the presence of that sequence by in situ hybridization.

The use of randomly cleaved DNA has an advantage in the construction of a library because it generates overlapping fragments that can be used in the analysis of regions of the chromosome extending out in both directions from a particular sequence, a technique known as chromosome walking. For example, if one isolates a fragment containing the coding region of a globin gene, that particular fragment can be labeled and used as a probe to screen the genomic library for fragments with which it overlaps.
The process is then repeated using the new fragments as labeled probes in successive screening steps, as one gradually isolates a longer and longer part of the original DNA molecule. Using this approach, one can study the organization of linked sequences in an extended region of a chromosome.

Cloning Larger DNA Fragments in Specialized Cloning Vectors

Neither plasmid nor lambda phage vectors are suitable for cloning DNAs larger than about 20 to 25 kb in length. Several vectors have been developed that allow investigators to clone much larger pieces of DNA. One of the most important of these vectors is the yeast artificial chromosome (YAC), which can accept segments of foreign DNA as large as 1000 kb (one million base pairs). As the name implies, YACs are artificial versions of a normal yeast chromosome. They contain all of the elements of a yeast chromosome that are necessary for the structure to be replicated during S phase and segregated to daughter cells during mitosis, including one or more origins of replication, telomeres at the ends of the chromosomes, and a centromere to which the spindle fibers can attach during chromosome separation. In addition to these elements, YACs are constructed to contain (1) a gene whose encoded product allows those cells containing the YAC to be selected from those that lack the element and (2) the DNA fragment to be cloned. Like other cells, yeast cells can take up

DNA from their medium, which provides the means by which YACs are introduced into the cells. Over the past few years, laboratories involved in sequencing genomes have relied heavily on an alternate cloning vector, called a bacterial artificial chromosome (BAC), which is also able to accept large foreign DNA fragments (up to about 500 kb). BACs are specialized bacterial plasmids (F factors) that contain a bacterial origin of replication and the genes required to regulate their own replication. BACs have the advantage over YACs in high-speed sequencing projects because they can be cloned in E. coli, which readily picks up exogenous DNA, has an extremely short generation time, can be grown at high density in simple media, and does not “corrupt” the cloned DNA through recombination. DNA fragments cloned in YACs and BACs are typically greater than 100 kb in length. Fragments of such large size are usually produced by treatment of DNAs with restriction enzymes that recognize particularly long nucleotide sequences (7–8 nucleotides) containing CG dinucleotides. As noted on page 531, CG dinucleotides have special functions in the mammalian genome, and presumably because of this they do not appear nearly as often as would be predicted by chance. The restriction enzyme NotI, for example, which recognizes the 8-nucleotide sequence GCGGCCGC, typically cleaves mammalian DNA into fragments several hundred thousand base pairs long. These fragments can then be incorporated into YACs or BACs and cloned within host yeast or bacterial cells.
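The numbers quoted in this section can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that the four bases occur independently and with equal frequency (which, as just noted, does not hold for CG dinucleotides in mammalian DNA), and it uses the standard Clarke and Carbon estimate of library size, N = ln(1 − P)/ln(1 − f), which is not given in the text. The genome size of 3 × 10^9 bp, the 150-kb BAC insert size, and the function names are assumptions chosen for the example.

```python
import math


def expected_site_spacing(site_length: int) -> int:
    """Average distance (bp) between occurrences of one specific site,
    assuming the four bases are equally frequent and independent."""
    return 4 ** site_length


def clones_needed(genome_bp: float, insert_bp: float, probability: float) -> int:
    """Clarke-Carbon estimate of how many random clones are needed so that
    any given sequence is present with the stated probability."""
    fraction = insert_bp / genome_bp  # share of the genome held by one clone
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


if __name__ == "__main__":
    genome = 3e9  # assumed size of a mammalian haploid genome, in bp
    print("4-base site, average spacing (bp):", expected_site_spacing(4))
    print("8-base site, average spacing (bp):", expected_site_spacing(8))
    print("20-kb phage clones for 99% coverage:", clones_needed(genome, 20_000, 0.99))
    print("150-kb BAC clones for 99% coverage:", clones_needed(genome, 150_000, 0.99))
```

Under these assumptions a four-base site occurs about every 256 bp, which is why only partial digestion yields 20-kb fragments, and several hundred thousand phage clones are needed, in line with the "million or so" plaques mentioned earlier. The random-sequence prediction for an eight-base site (about 65 kb) is well below the several hundred kilobases observed for NotI fragments, which is consistent with the scarcity of CG dinucleotides.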

cDNA Libraries

Up to this point, the discussion has been restricted to cloning DNA fragments isolated from extracted DNA, that is, genomic fragments. When working with genomic DNA, one is generally seeking to isolate a particular gene or family of genes from among hundreds of thousands of unrelated sequences. In addition, the isolation of genomic fragments allows one to study a variety of topics, including (1) regulatory sequences flanking the coding portion of a gene; (2) noncoding intervening sequences; (3) various members of a multigene family, which often lie close together in the genome; (4) evolution of DNA sequences, including their duplication and rearrangement as seen in comparisons of the DNA of different species; and (5) interspersion of transposable genetic elements. The cloning of cDNAs, as opposed to genomic DNAs, has been especially important in the analysis of gene expression and alternative splicing. To produce a cDNA library, a population of mRNAs is isolated and used as a template by reverse transcriptase to form a population of DNA–RNA hybrids, as shown in Figure 18.45a. The DNA–RNA hybrids are converted to a double-stranded cDNA population by nicking the RNA with RNase H and replacing it with DNA by DNA polymerase I. The double-stranded cDNA is then combined with the desired vector (in this case a plasmid) and cloned as shown in Figure 18.45b. Messenger RNA populations typically contain thousands of different messages, but individual species may be present in markedly different numbers (abundance). As a result, a cDNA library has to contain a

million or so different cDNA clones to be certain that the rarer mRNAs will be represented. In addition, reverse transcriptase is not a very efficient enzyme and tends to fall off its template mRNA before the copying job is completed. As a result, it can be difficult to obtain a population of full-length cDNAs. As with experiments using genomic DNA fragments, clones must be screened to isolate one particular sequence from a heterogeneous population of recombinant molecules. The analysis of cloned cDNAs serves several functions. It is generally easier to study a diverse population of cDNAs than the corresponding population of mRNAs, so one can use the cDNAs to learn about the diversity and abundance of mRNAs present in a cell, the percentage of mRNAs shared by two different types of cells, or the variety of alternatively spliced mRNAs generated from a specific primary transcript. A single cloned and amplified cDNA molecule is also very useful. In more practical endeavors, the purified cDNA can be readily sequenced to provide a shortcut in determining the amino acid sequence of a polypeptide; labeled cDNAs are used as probes to screen for complementary sequences among recombinant clones; and because they lack introns, cDNAs have an advantage over genomic fragments when one is attempting to synthesize eukaryotic proteins in bacterial cell cultures.

Figure 18.45 Synthesizing cDNAs for cloning in a plasmid. (a) In this method of cDNA formation, a short poly(dT) primer is bound to the poly(A) of each mRNA, and the mRNA is transcribed by reverse transcriptase (which requires the primer to initiate DNA synthesis). Once the DNA–RNA hybrid is formed, the RNA is nicked by treatment with RNase H, and DNA polymerase I is added to digest the RNA and replace it with DNA, just as it does during DNA replication in a bacterial cell (page 557). (b) To prepare the blunt-ended cDNA for cloning, a short stretch of poly(G) is added to the 3′ ends of the cDNA and a complementary stretch of poly(C) is added to the 3′ ends of the plasmid DNA. The two DNAs are mixed and allowed to form recombinants, which are sealed and used to transform bacterial cells in which they are cloned. [Diagram labels, in order: (a) anneal poly(dT) primer to poly(A) tail of mRNA; make DNA copy with reverse transcriptase; treat hybrid with RNase H, which nicks the RNA, leaving free 3′-OH groups; add DNA polymerase I, which uses the RNA fragments as primers and replaces the RNA with DNA; double-stranded cDNA. (b) cDNA and plasmid DNA: add enzyme deoxynucleotidyl transferase in presence of dGTP or dCTP; mix, allow formation of recombinant DNA, ligate; bacterium in which DNA can be cloned.]

18.16 | DNA Transfer into Eukaryotic Cells and Mammalian Embryos

We have discussed in previous sections how eukaryotic genes can be isolated, modified, and amplified. In this section we will consider some of the ways in which genes can be introduced into eukaryotic cells, where they are generally transcribed and translated. One of the most widely used strategies to accomplish this goal is to incorporate the DNA into the genome of a nonreplicating virus and allow the virus to infect the cell. Viral-mediated gene transfer is called transduction. Depending on the type of virus used, the gene of interest can be expressed transiently for a period of hours to days, or it can be stably


integrated into the genome of the host cell. Stable integration is usually accomplished using modified retroviruses, which contain an RNA genome that is reverse transcribed into DNA inside the cell. The DNA copy is then inserted into the DNA of the host chromosomes. Retroviruses have been used in many of the recent attempts at gene therapy to transfer a gene into the cells of a patient who is lacking that gene. Generally, these clinical trials have not had much success due to the low efficiency of infection of current viral vectors. A number of procedures are available to introduce naked DNA into cultured cells, a process called transfection. Most often the cells are treated with either calcium phosphate or DEAE-dextran, both of which form a complex with the added DNA that promotes its adherence to the cell surface. It is estimated that only about 1 in 10⁵ cells takes up the DNA and incorporates it stably into the chromosomes. It is not known why this small percentage of cells in the population is competent to be transfected, but those that are transfected generally pick up several fragments. One way to select for those cells that have taken up the foreign DNA is to include a gene that allows the transfected cells to grow in a particular medium in which nontransfected cells cannot survive. Because transfected cells typically take up more than one fragment, the gene used for selection need not be located on the same DNA fragment as the gene whose role is being investigated (called the transgene). Two other procedures to transfect cells are electroporation and lipofection. In electroporation, the cells are incubated with DNA in special vials that contain electrodes that deliver a brief electric pulse. Application of the current causes the plasma membrane to become transiently permeable to DNA molecules, some of which find their way into the nucleus and become integrated into the chromosomes. 
In lipofection, cells are treated with DNA that is bound to positively charged lipids (cationic liposomes) that are capable of fusing with the lipid bilayer of the cell membrane and delivering the DNA to the cytoplasm. One of the most direct ways to introduce foreign genes into a cell is to microinject DNA directly into the cell’s nucleus. The nuclei of oocytes and eggs are particularly well suited for this approach. Xenopus oocytes, for example, have long been used to study the expression of foreign genes. The nucleus of the oocyte contains all the machinery necessary for RNA synthesis; when foreign DNA is injected into the nucleus, it is readily transcribed. In addition, the RNAs synthesized from the injected templates are processed normally and transported to the cytoplasm, where they are translated into proteins that can be detected immunologically or by virtue of their specific activity. Another favorite target for injected DNA is the nucleus of a mouse embryo (Figure 18.46). The goal in such experiments is not to monitor the expression of the gene in the injected cell, but to have the foreign DNA become integrated into the egg’s chromosomes to be passed on to all the cells of the embryo and subsequent adult. Animals that have been genetically engineered so that their chromosomes contain foreign genes are called transgenic animals, and they provide a means to monitor when and where in the embryo particular genes are expressed (as in Figure 12.34), and to determine the

Figure 18.46 Microinjection of DNA into the nucleus of a recently fertilized mouse egg. The egg is held in place by a suction pipette shown at the right, while the injection pipette is shown penetrating the egg at the left. (FROM THOMAS E. WAGNER ET AL., PROC. NAT’L. ACAD. SCI. U.S.A. 78:6377, 1981.)

impact of extra copies of particular genes on the development and life of the animal.

Transgenic Animals and Plants

In 1981, Ralph Brinster of the University of Pennsylvania and Richard Palmiter of the University of Washington succeeded in introducing a gene for rat growth hormone (GH) into the fertilized eggs of mice. The injected DNA was constructed so as to contain the coding portion of the rat GH gene in a position just downstream from the promoter region of the mouse metallothionein gene. Under normal conditions, the synthesis of metallothionein is greatly enhanced following the administration of metals, such as cadmium or zinc, or glucocorticoid hormones. The metallothionein gene carries a strong promoter, and it was hoped that placing the GH gene downstream from it might allow the GH gene to be expressed following treatment of the transgenic mice with metals or glucocorticoids. As illustrated in Figure 18.47, this expectation was fully warranted. On the practical side, transgenic animals provide a mechanism for creating animal models, which are laboratory animals that exhibit a particular human disease they would normally not be subject to. This approach is illustrated by the transgenic mice that carry a gene encoding a mutant version of the human amyloid precursor protein (APP). As discussed on page 68, these mice develop neurological and behavioral symptoms reminiscent of Alzheimer’s disease and are an important resource in the development of therapies for this terrible disease. Transgenic animals are also being developed as part of agricultural biotechnology. For example, pigs born with foreign growth hormone genes incorporated into their chromosomes grow much leaner than control animals lacking the genes. The meat of the transgenic animals is leaner because the excess growth hormone stimulates the conversion of nutrients into protein rather than fat. Plants are also prime candidates for genetic engineering. 
Figure 18.47 Transgenic mice. This photograph shows a pair of littermates at age 10 weeks. The larger mouse developed from an egg that had been injected with DNA containing the rat growth hormone gene placed downstream from a metallothionein promoter. The larger mouse weighs 44 g; the smaller, uninjected control weighs 29 g. The rat GH gene was transmitted to offspring that also grew larger than the controls. (COURTESY OF RALPH L. BRINSTER, FROM WORK REPORTED IN R. D. PALMITER ET AL., NATURE 300:611, 1982. REPRINTED BY PERMISSION FROM MACMILLAN PUBLISHERS LTD.)

Figure 18.48 Formation of transgenic plants using the Ti plasmid. The transgene is spliced into the DNA of the Ti plasmid, which is reintroduced into host bacteria. Bacteria containing the recombinant plasmid are then used to transform plant cells, in this case cells of the meristem at the tip of a dissected shoot apex. The transformed shoots are transferred to a selection medium where they develop roots. The rooted plants can then be transferred to potting soil. [Diagram labels, in order: Ti plasmid plus foreign DNA gives recombinant Ti plasmid; introduce into Agrobacterium tumefaciens bacteria; allow seeds to germinate in culture dish; cut shoot apex containing meristem; transform shoot apices; transformed shoots; transfer shoot apices to rooting medium; transfer to soil.]

It was indicated on page 229 that whole plants can be grown

from individual cultured plant cells. This provides the opportunity to alter the genetic composition of plants by introducing DNA into the chromosomes of cultured cells that can subsequently be grown into mature plants. One way to introduce foreign genes is to incorporate them into the Ti plasmid of the bacterium Agrobacterium tumefaciens. Outside of the laboratory, this bacterium lives in association with dicotyledonous plants, where it causes the formation of tumorous lumps called crown galls. During an infection, a section of the Ti plasmid termed the T-DNA region is passed from the bacterium into the plant cell. This part of the plasmid becomes incorporated into the plant cell’s chromosomes and induces the cell to proliferate and provide nutrients for the bacterium. The Ti plasmid can be isolated from bacteria and linked with foreign genes to produce a recombinant plasmid that is taken up in culture by undifferentiated dicotyledonous plant cells, including those of carrots and tobacco. Figure 18.48 shows the use of recombinant Ti plasmids to introduce foreign DNA into meristem cells at the tip of newly formed shoots. The shoots can then be rooted and grown into mature plants. This technique, which is called T-DNA transformation, has been used to transform plant cells with genes derived from bacteria that code for insect-killing toxins, thus protecting the plants from insect predators. Other procedures have been developed to introduce foreign genes into cells of monocotyledonous plants. In one of the more exotic approaches, plant cells can serve as targets for microscopic, DNA-coated tungsten pellets fired by a “gene


gun.” This technique has been used to genetically alter a number of different types of plant cells. The two most important activities that plant geneticists might improve by genetic engineering are photosynthesis and nitrogen fixation, both crucial bioenergetic functions. Any significant improvement in photosynthetic efficiency would mean great increases in crop production. It is hoped that a modified version of the CO2-fixing enzyme might be engineered that is less susceptible to photorespiration (page 229). Nitrogen fixation is an activity carried out by certain genera of bacteria (e.g., Rhizobium) that live in a symbiotic relationship with certain plants (e.g., soybean, peanut, alfalfa, and pea). The bacteria reside in swellings, or leguminous nodules, located on the roots, where they remove N2 from the atmosphere, reduce it to ammonia, and deliver the product to the cells of the plant. Geneticists are looking for a way to isolate genes involved in this activity from the bacterial genome and introduce them into the chromosomes of nonleguminous plants that presently depend heavily on added fertilizer for their reduced nitrogenous compounds. Alternatively, it might be possible to alter the genome of either plant or bacteria so that new types of symbiotic relationships can be developed.


18.17 | Determining Eukaryotic Gene Function by Gene Elimination or Silencing

Until fairly recently, investigators discovered new genes and learned of their function by screening for mutants that exhibited abnormal phenotypes (see page 277 on the study of protein secretion). It was only through the process of random mutation that the existence of genes became apparent. This process of learning about genotypes by studying mutant phenotypes is known as forward genetics. Since the development of techniques for gene cloning and DNA sequencing, researchers have been able to identify and study genes directly without knowing anything about the function of their encoded protein. This condition has become especially familiar in recent years with the sequencing of entire genomes and the identification of thousands of genes whose function remains unknown. Over the past two decades, researchers have developed means to carry out reverse genetics, which is a process of determining phenotype (i.e., function) based on the knowledge of genotype. The basic approach of reverse genetics is to eliminate the function of a specific gene, and then determine what effect the elimination of that function has on phenotype. We will first consider how researchers introduce specific mutations into genes in vitro, then discuss two widely used techniques for eliminating gene function in vivo.

In Vitro Mutagenesis

As is evident throughout this book, the isolation of naturally occurring mutants has played an enormously important role in determining the function of genes and their products. But natural mutations are rare events, and it isn’t feasible to use such mutations to study the role of particular amino acid residues in the function of a particular protein. Rather than waiting for an organism to appear with an unusual phenotype and identifying the responsible mutation, researchers can mutate the gene (or its associated regulatory regions) in a desired way and observe the resulting phenotypic change. These techniques, collectively termed in vitro mutagenesis, require that the gene, or at least the gene segment, to be mutated has been cloned. A procedure developed by Michael Smith of the University of British Columbia called site-directed mutagenesis (SDM) allows researchers to make very small, specific changes in a DNA sequence, such as the substitution of one base for another, or the deletion or insertion of a very small number of bases. SDM is usually accomplished by first synthesizing a DNA oligonucleotide containing the desired change, allowing this oligonucleotide to hybridize to a single-stranded preparation of the normal DNA, and then using the oligonucleotide as a primer for DNA polymerase. The polymerase elongates the primer by adding nucleotides that are complementary to the normal DNA. The modified DNA can then be cloned and the effect of the genetic alteration determined by introducing the DNA into an appropriate host cell. Scientists typically use SDM to ask very targeted questions about the function of a gene or protein. For example, they might change one amino acid into another to gain insight into the role of that specific site in the overall function of a protein. Alternatively, they might introduce small changes in the regulatory region of a gene and determine the effect on gene expression. If the intent of site-directed mutagenesis is simply to eliminate the function of a gene, less specific methods can be used. 
For example, cutting a gene sequence at a restriction site, using DNA polymerase to make the single-stranded regions of the sticky ends into double-stranded DNA, and then ligating those ends back together can destroy the reading frame of a protein. In other instances, an entire restriction fragment might be removed from the gene. Constructing a mutation in vitro is only one aspect of reverse genetics. To study the effect of an engineered mutation on phenotype, it is necessary to substitute the mutant allele for the normal gene in the organism in question. The development of a technique for introducing mutations into the genome of the mouse opened the door for reverse genetic studies in mammals, and literally revolutionized the study of mammalian gene function.
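The two kinds of in vitro change described above, a targeted single-base substitution and a frameshift produced by filling in and religating sticky ends, can be illustrated with a short sketch. Everything specific here is an assumption made for the example: the toy coding sequence, the 8-base flanks on the oligonucleotide, the function names, and the choice of EcoRI (which cuts G^AATTC and leaves a four-base 5′ overhang) as the restriction enzyme. Real mutagenic primers are longer and are designed with the melting temperature in mind.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))


def translate(dna):
    """Translate complete codons from position 0, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def substitute(dna, position, new_base):
    """Return the sequence with one base replaced (0-based position)."""
    return dna[:position] + new_base + dna[position + 1:]


def mutagenic_oligo(dna, position, new_base, flank=8):
    """Oligonucleotide carrying the change, with up to `flank` matching
    bases on each side (hypothetical design rule for this sketch)."""
    mutant = substitute(dna, position, new_base)
    return mutant[max(0, position - flank):position + flank + 1]


def fill_in_and_religate(dna, site="GAATTC", cut_offset=1):
    """Cut at a palindromic site that leaves 5' overhangs, fill in both
    ends with DNA polymerase, and join the blunt ends back together."""
    start = dna.find(site)
    if start < 0:
        raise ValueError("restriction site not found")
    left_end = site[:len(site) - cut_offset]   # filled-in left fragment end
    right_end = site[cut_offset:]              # filled-in right fragment end
    return dna[:start] + left_end + right_end + dna[start + len(site):]


if __name__ == "__main__":
    gene = "ATGGCTGAATTCAAAGGTTAA"             # toy open reading frame
    point_mutant = substitute(gene, 4, "A")    # GCT (Ala) becomes GAT (Asp)
    frameshift = fill_in_and_religate(gene)    # site becomes GAATTAATTC
    print(translate(gene), translate(point_mutant), translate(frameshift))
    print("mutagenic oligo:", mutagenic_oligo(gene, 4, "A"))
```

In this toy case the substitution changes a single residue, whereas the fill-in adds four bases, which is not a multiple of three, so every codon downstream of the site is read in a different frame.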

Knockout Mice

We have described in many places in this text the phenotypes of mice that lack a functional copy of a particular gene. For example, it was noted that mice lacking a functional copy of the p53 gene invariably develop malignant tumors (page 677). These animals, which are called knockout mice, can provide a unique insight into the genetic basis of a human disease, as well as a basis for studying the various cellular activities in which the product of a particular gene might be engaged. Knockout mice are born as the result of a series of experimental procedures depicted in Figure 18.49.

Figure 18.49 Formation of knockout mice. The steps are described in the text. [Diagram labels, in order: blastocyst with inner cell mass (contains ES cells) and trophectoderm cells; dissociate cells of blastocyst and culture ES cells on a monolayer of nondividing cells; transfect cells with DNA containing mutant gene X; transfected ES cell (heterozygous for mutant gene X); grow cells in selective medium; inject transfected X+/− ES cells into host blastocyst (host is homozygous recessive for black coat color, a/a); chimeric blastocyst containing inner cell mass with two types of cells; chimeric mouse with black (a/a) and brown (A/a) pigmented coat; to determine whether germ cells are derived from injected ES cells, mate the chimeric mouse with a member of the black strain; nonchimeric mouse with brown (A/a) coat, heterozygous for gene X (X+/−). Knockout mice are produced by mating two of these heterozygotes and picking out X−/− homozygotes.]

⁵ An alternate technique for gene elimination has recently been used to produce the first rat knockout. In this case, a specific gene in the rat genome was targeted by injection into a rat zygote (one-stage embryo) of mRNA encoding a type of enzyme called a zinc-finger nuclease (ZFN). ZFNs are artificial restriction enzymes that are produced in the lab by fusing a gene encoding a zinc-finger DNA-binding domain (page 520) with a DNA-cleaving domain. ZFNs cut both DNA strands at a specific nucleotide sequence that is determined by the genetically engineered amino acid sequence of the enzyme. These proteins can be customized to attack a large array of specific sequences and thus potentially can be used to knock out virtually any gene in the genome. Because the same target DNA sequence is present on both maternal and paternal alleles, injection of a preparation of one of these enzymes into a cell deletes both copies of the target gene, thus generating homozygous knockouts without having to go through a heterozygous intermediate, as is the case with the homologous recombination approach shown in Figure 18.49. Another class of artificial proteins called TAL effector nucleases (or TALENs) has recently been generated as endonucleases that can be genetically engineered to target any specific DNA sequence. It is too early to know whether or not these nucleases will find widespread use in the formation of knockouts, but they have already become important gene-editing tools.

The various procedures used to generate knockout mice were developed in the 1980s by Mario Capecchi at the University of Utah, Oliver Smithies at the University of Wisconsin, and Martin Evans at Cambridge University. The first step is to isolate an unusual type of cell that has virtually unlimited powers of differentiation. These cells, called embryonic stem cells (page 21), are found in the mammalian blastocyst, which is an early stage of embryonic development comparable to the blastula stage of other animals. A mammalian blastocyst (Figure 18.49) is composed of two distinct parts. The outer layer makes up the trophectoderm, which gives rise to most of the extraembryonic membranes characteristic of a mammalian embryo. The inner surface of the trophectoderm contacts a cluster of cells called the inner cell mass that projects into the spacious cavity (the blastocoel). The inner cell mass gives rise to the cells that make up the embryo. The inner cell mass contains the embryonic stem (ES) cells, which differentiate into all of the various tissues of which a mammal is composed. ES cells can be isolated from blastocysts and cultured in vitro under conditions where the cells grow and proliferate. The ES cells are then transfected with a DNA fragment containing a nonfunctional, mutant allele of the gene to be knocked out, as well as antibiotic-resistance genes that can be used to select for cells that have incorporated the altered DNA into their genome. Of those cells that take up the DNA, approximately one in 10⁴ undergoes a process of homologous recombination in which the transfected DNA replaces the homologous DNA sequence.⁵ Through use of this procedure, ES cells that are heterozygous for the gene in question are produced and then selected for on the basis of their antibiotic resistance. In the next step, a number of these donor ES cells are injected into the blastocoel of a recipient mouse embryo. 
In the protocol depicted in Figure 18.49, the recipient embryo is obtained from a black strain. The injected embryo is then implanted into the oviduct of a female mouse that has been hormonally prepared to carry the embryo to term. As the embryo develops in its surrogate mother, the injected ES cells join the embryo’s own inner cell mass and contribute to formation of embryonic tissues, including the germ cells of the gonads. These chimeric mice can be


recognized because their coat contains characteristics of both the donor and recipient strains. To determine whether the germ cells also contain the knockout mutation, the chimeric mice are mated to a member of an inbred black strain. If the germ cells do contain the knockout mutation, the offspring will be heterozygous for the gene in all of their cells. Heterozygotes can be distinguished by their brown coat coloration. These heterozygotes are then mated to one another to produce offspring that are homozygous for the mutant allele. These are the knockout mice that lack a functional copy of the gene. Any gene within the genome, or any DNA sequence for that matter, can be altered in any desired manner using this experimental approach. In some cases, the deletion of a particular gene can lead to the absence of a particular process, which provides convincing evidence that the gene is essential for that process. Often, however, the deletion of a gene that is thought to participate in an essential process causes little or no alteration in the animal’s phenotype. Such results can be difficult to interpret. It is possible, for example, that the gene is not involved in the process being studied or, as is usually the case, the absence of the gene product is compensated by the product of an entirely different gene. Compensation by one gene for another can be verified by producing mice that lack both of the genes in question (i.e., a double knockout). In other cases, the absence of a gene leads to the death of the mouse during early development, which also makes it difficult to determine the role of the gene in cellular function. Researchers are often able to get around this problem by using a technique that allows a particular gene to be switched off in only one or several desired tissues, while the gene is expressed in the remainder of the animal. 
These conditional knockouts, as they are called, allow researchers to study the role of the gene on the development or function of the affected tissue. A current goal of the International Knockout Mouse Consortium is to generate conditional knockouts of every gene in the mouse genome.
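The two matings in the knockout protocol follow simple Mendelian expectations, which the sketch below enumerates. It assumes that every genotype is equally viable (the text notes that some knockouts die early in development, which would distort these ratios) and that the chimera's germ cells derive from the heterozygous ES cells. The allele symbols and the function name are chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Expected genotype frequencies among offspring of two parents, each
    given as a pair of alleles, assuming Mendelian segregation."""
    counts = Counter("/".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    heterozygote = ("+", "-")   # one wild-type and one knockout allele of gene X
    wild_type = ("+", "+")      # member of the inbred black strain

    # Germ cells derived from the injected ES cells, mated to the black strain.
    print("chimera germ line x black strain:", cross(heterozygote, wild_type))

    # Two heterozygous offspring mated to each other.
    print("heterozygote x heterozygote:", cross(heterozygote, heterozygote))
```

On these assumptions, half of the ES-cell-derived offspring of the first mating are heterozygous, and one quarter of the offspring of the second mating are the homozygous knockouts.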

RNA Interference

The use of knockout mice has been an invaluable strategy for learning about gene function, but the generation of these animals is laborious and costly. In the past few years a new technique in the realm of reverse genetics has come into widespread use. As discussed on page 457, RNA interference (RNAi) is a process in which a specific mRNA is degraded due to the presence of a small double-stranded siRNA whose sequence is contained within the mRNA sequence. The functions of plant, nematode, or fruit fly genes can be studied by simply introducing an siRNA into one of these organisms by various means and examining the phenotype of the organism that results from depletion of the corresponding mRNA. Determining the roles of genes at the cellular level, rather than the organismal level, can be accomplished by studying the effects of these same siRNAs on the activities of cultured cells. Using these approaches, researchers can gather information about the functions of large numbers of genes in a relatively short period of time. As discussed on page 457, RNAi can be used to study

gene function in mammalian cells by incubating the cells with small dsRNAs encapsulated in lipids or by genetically engineering the cells to produce the siRNAs themselves. Once inside the cell, the siRNA guides the degradation of the target mRNA, leaving the cell unable to produce additional protein encoded by that gene. Any deficiencies in the phenotype of the cell can be attributed to a marked reduction in the level of the protein being investigated. As in the case of gene knockouts, the absence of a phenotype following treatment with an siRNA does not necessarily mean that the gene in question is not involved in a particular process, as another gene may be capable of compensating for the absence of expression of the gene being targeted. Libraries containing thousands of siRNAs, or vectors containing DNA encoding these RNAs, are also available for the study of gene function in a number of model organisms and in human cells. Researchers using these libraries can study the effects of the elimination of a gene’s expression on virtually any cellular process. It should be possible, for example, to determine which genes are involved in mitotic spindle assembly by treating cells with a library of siRNAs from across the entire genome and then screening all of the treated cells at the time of cell division for the presence of an abnormal mitotic spindle. Figure 18.50 shows a gallery of images of cultured Drosophila cells in the metaphase stage of mitosis. Each of the cells depicted in this gallery had been treated with an siRNA directed against a different gene in the Drosophila genome. Over 14,000 different siRNAs were tested in this study, of which approximately 200 caused the treated cells to have an abnormal metaphase spindle. More than half of the siRNAs that affected mitotic spindle assembly targeted genes that were not previously known to be involved in the formation of this structure, thus providing new insights into the functions of these genes. 
The images in Figure 18.51 show the effects of a single siRNA that targets a gene known to be involved in mitotic spindle assembly and other activities during mitosis. In this case, the cultured mammalian cell seen on the right had been transfected with an siRNA that targets mRNAs encoding the Aurora B kinase (page 596). The resulting knockdown of this enzyme has seriously disturbed chromosome segregation.
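The logic of an siRNA screen can be made concrete with a short sketch. The snippet below is illustrative only: the sequences, the gene names, and the helper functions `sirna_targets`, `screen`, and `hit_rate` are all hypothetical, and target recognition is reduced to an exact substring test, which ignores the real machinery of RNAi (strand selection, partial matches, off-target effects). It reports which mRNAs contain a given 21-nucleotide siRNA sequence and recomputes the approximate hit rate of the Drosophila spindle screen described above (about 200 of roughly 14,000 siRNAs).

```python
# Illustrative sketch only: sequences and gene names are made up.

def sirna_targets(sirna_sense: str, mrna: str) -> bool:
    """Return True if the siRNA sequence occurs within the mRNA sequence."""
    return sirna_sense.upper() in mrna.upper()


def screen(library: dict[str, str], transcriptome: dict[str, str]) -> dict[str, list[str]]:
    """Map each siRNA name to the genes whose mRNA contains its sequence."""
    return {
        name: [gene for gene, mrna in transcriptome.items() if sirna_targets(seq, mrna)]
        for name, seq in library.items()
    }


def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tested siRNAs that produced the phenotype of interest."""
    return hits / tested


# Hypothetical mini "transcriptome" and siRNA library (21-nucleotide sequences).
transcriptome = {
    "geneA": "AUGGCUAGCUAGGCUUACGGAUCCGAUUACGCUAGCUAAGG",
    "geneB": "AUGCCCGGGAAAUUUCCCGGGAAAUUUCCCGGGAAAUUUUGA",
}
library = {
    "si_A": "GCUAGGCUUACGGAUCCGAUU",     # contained in geneA
    "si_B": "CCCGGGAAAUUUCCCGGGAAA",     # contained in geneB
    "si_none": "UUUUUUUUUUUUUUUUUUUUU",  # contained in neither mRNA
}

targets = screen(library, transcriptome)
approx_rate = hit_rate(200, 14_000)  # roughly 200 hits among roughly 14,000 siRNAs
```

With the figures quoted in the text, the hit rate works out to about 1.4 percent, which is why automated image selection was needed to find the affected cells among millions examined.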

18.18 | The Use of Antibodies

As discussed in Chapter 17, antibodies (or immunoglobulins) are proteins produced by lymphoid tissues in response to the presence of foreign materials, or antigens. One of the most striking properties of antibodies, and one that makes them so useful to biological researchers, is their remarkable specificity. A cell may contain thousands of different proteins, yet a preparation of antibodies binds only to those select molecules within the cell having a small part that fits into the antigen-binding sites of the antibody molecules. Antibodies can often be obtained that distinguish between two polypeptides that differ by only a single amino acid or by a single posttranslational modification (as on page 527).


Figure 18.51 Determining gene function by RNA interference. (Left) An untreated, cultured mammalian cell at metaphase of mitosis. (Right) The same type of cell that has been treated with a 21-nucleotide dsRNA designed to induce the destruction of mRNAs encoding Aurora B kinase, a protein involved in the metaphase spindle checkpoint. The chromosomes in this cell lie adjacent to the mitotic spindle, suggesting the absence of kinetochore–microtubule interactions. Image labels: Tubulin, DNA. (FROM CLAIRE DITCHFIELD ET AL., J. CELL BIOL. 161:276, 2003, FIG. 8D. REPRODUCED WITH PERMISSION OF THE ROCKEFELLER UNIVERSITY PRESS.)

Figure 18.50 Determining gene function by RNA interference. The figure shows a gallery of immunofluorescence images of cultured Drosophila cells that had been incubated with various double-stranded siRNAs. The cells shown here had accumulated in metaphase of mitosis as the result of a separate treatment that blocked progression into anaphase. During the course of this study, over 4 million cells were examined. It was possible to screen such a large number of cells by having a computer select the images of cells that were in metaphase and then crop and assemble the images into panels as shown in this photograph. The images could then be screened for the presence of abnormal mitotic spindles either by human observers or by computer analysis using programs designed to detect specific features of an abnormal spindle. (An even larger study of the same type can be found in Nature 464:721, 2010.) (COURTESY OF GOHTA GOSHIMA AND RONALD D. VALE.)


Biologists have long taken advantage of antibodies and have developed a wide variety of techniques involving their use. There are basically two different approaches to the preparation of antibody molecules that interact with a given antigen. In the traditional approach, an animal (typically a rabbit or goat) is repeatedly injected with the antigen, and after a period of several weeks, blood is drawn that contains the desired antibodies. Whole blood is treated to remove cells and

clotting factors so as to produce an antiserum, which can be tested for its antibody titer and from which the immunoglobulins can be purified. Although this method of antibody production remains in use, it suffers from certain inherent disadvantages. Because of the mechanism of antibody synthesis, an animal invariably produces a variety of different species of immunoglobulin, that is, immunoglobulins with different V regions in their polypeptide chains, even when challenged with a highly purified preparation of antigen. An antiserum that contains a variety of immunoglobulins that bind to the same antigen is said to be polyclonal. Because immunoglobulins are too similar in structure to be fractionated, it becomes impossible to obtain a preparation of a single purified species of antibody by this technique. In 1975, Cesar Milstein and Georges Köhler of the Medical Research Council in Cambridge, England, carried out a far-reaching set of experiments that led to the preparation of antibodies directed against specific antigens. To understand their work, it is necessary to digress briefly. A given clone of antibody-producing cells (which is derived from a single B lymphocyte) synthesizes antibodies having identical antigen-combining sites. The heterogeneity of the antibodies produced when a single purified antigen is injected into an animal results from the fact that many different B lymphocytes are activated, each having membrane-bound antibodies with an affinity for a different part of the antigen. An important question arose: was it possible to circumvent this problem and obtain a single species of antibody molecule? Consider for a moment the results of a procedure in which one injects an animal with a purified antigen, waits a period of weeks for antibodies to be produced, removes the spleen or other lymphoid organs, prepares a suspension of single cells, isolates those cells producing the desired antibody, and grows these particular

cells as separate colonies so as to obtain large quantities of this particular immunoglobulin. If this procedure were followed, one should obtain a preparation of antibody molecules produced by a single colony (or clone) of cells, which is referred to as a monoclonal antibody. But antibody-producing cells do not grow and divide in culture, so an additional manipulation had to be introduced to obtain monoclonal antibodies. Malignant myeloma cells are a type of cancer cell that grows rapidly in culture and produces large quantities of antibody. But myeloma cells are of little use as analytic tools because they are not formed in response to a particular antigen. Instead, myeloma cells develop from the random conversion of a normal lymphocyte to the malignant state, and therefore, they produce the antibody that was being synthesized by the particular lymphocyte before it became malignant. Milstein and Köhler combined the properties of these two types of cells, the normal antibody-producing lymphocyte and the immortal myeloma cell. They accomplished this feat by fusing the two types of cells to produce hybrid cells, called hybridomas, that grow and proliferate indefinitely, producing large amounts of a single (monoclonal) antibody. The antibody produced is one that was being synthesized by the normal lymphocyte prior to fusion with the myeloma cell. The procedure for the production of monoclonal antibodies is illustrated in Figure 18.52. The antigen (whether present in a soluble form or as part of a cell) is injected into the mouse to cause the proliferation of specific antibody-forming cells (step 1, Figure 18.52). After a period of several weeks, the spleen is removed and dissociated into single cells (step 2), and the antibody-producing lymphocytes are fused with a population of malignant myeloma cells (step 3), which makes the hybrid cells immortal, that is, capable of unlimited cell division. 
Hybrid cells (hybridomas) are selected from nonfused cells by placing the cell mixture in a medium in which only the hybrids can survive (step 4). Hybridomas are then grown clonally in separate wells (step 5) and individually screened for production of antibody against the antigen being studied. Hybrid cells containing the appropriate antibody (step 6) can then be cloned in vitro or in vivo (by growing as tumor cells in a recipient animal), and unlimited amounts of the monoclonal antibody can be prepared. Once the hybridomas have been produced, they can be stored indefinitely in a frozen state, and aliquots can be made available to researchers around the world. One of the most important features of this methodology is that one does not have to begin with a purified antigen to obtain an antibody. In fact, the antigen to which the monoclonal antibody is sought may even be a minor component of the entire mixture. Once a preparation of antibody molecules is in hand, whether obtained by conventional immunologic techniques or by formation of hybridomas, these molecules can be used as highly specific probes in a variety of analytic techniques. For example, antibodies can be used in protein purification. When a purified antibody is added to a crude mixture of proteins, the specific protein being sought selectively combines with the antibody and precipitates from solution. Antibodies can also be used in conjunction with various types of fractionation procedures to identify a particular protein (antigen) among a mixture of proteins. In a Western blot, for example, a mixture of

Figure 18.52 Formation of monoclonal antibodies. The steps are described in the text. The HAT medium is so named because it contains hypoxanthine, aminopterin, and thymidine. This medium allows cells with a functional hypoxanthine–guanine phosphoribosyl transferase (HGPRT) to grow, but it does not support the growth of cells lacking this enzyme, such as the unfused myeloma cells used in this procedure.
Figure labels, in order: (1) Mouse being immunized with antigen. Myeloma cells in culture; cells are HGPRT⁻ mutants and unable to grow in HAT medium. Remove spleen. (2) Spleen-cell suspension. (3) Fusion of cells in polyethylene glycol. Add cells to HAT selection medium. (4) Hybrid cells (hybridomas) grow while unfused cells die. (5) Test supernatant in which cells are growing for presence of desired antibody. Then culture cells that produce this antibody. (6) Culture clone of cells that produces antibody against injected antigen.

proteins is first fractionated by two-dimensional gel electrophoresis (as in Figure 18.29). The fractionated proteins are then transferred to a sheet of nitrocellulose filter, which is


then incubated with a preparation of antibodies that have been labeled either radioactively or fluorescently. The location on the filter of the specific protein bound by the antibody can be determined from the location of the bound radioactivity or fluorescence. In addition to their use in research, monoclonal antibodies are playing a valuable role in diagnostic medicine in tests to determine the concentration of specific proteins in the blood or urine. In one example, monoclonal antibodies form the basis of certain home-pregnancy tests that monitor the presence of a protein (chorionic gonadotrophin) that appears in the urine a few days after conception. Monoclonal antibodies have also become very useful as therapeutic agents in treating human disease (page 688). To date, efforts to develop human hybridomas that produce human monoclonal antibodies have not been successful. To get around this stumbling block, mice have been genetically engineered so that the antibodies they produce are increasingly human in amino acid sequence. A number of these humanized monoclonal antibodies have been approved for the treatment of several diseases. More recently, mice have been engineered so that their immune system is essentially “human” in nature. These animals produce antibodies that are fully human in structure. The first fully human antibody (Humira), which is approved for the treatment of rheumatoid arthritis (page 726), was produced by a very different technique that uses bacteriophage rather than hybridomas as the basis for monoclonal antibody production. The technique is known as phage display. In this technique, billions of different phage particles are generated in which each phage carries a gene encoding a human antibody molecule that possesses a unique variable region (page 712). Different phage within this vast library encode different antibodies, that is, antibodies with different variable regions. 
In each case, the antibody gene is fused to a gene encoding one of the viral coat proteins so that when the phage is assembled within a host cell, the antibody molecule is displayed on the surface of the viral particle. Suppose that you have a protein (antigen) that you believe would be a good target for a particular therapeutic antibody. The antigen is purified and allowed to interact with a sample of each of the billions of phage particles that make up the phage library. Those phage that bind to the antigen with high affinity can be identified and allowed to multiply with an appropriate host cell. Once it has

been amplified this way, the DNA encoding the antibody gene can be isolated and used to transfect an appropriate mammalian cell. The genetically engineered cells can then be grown in large cultures to produce therapeutic quantities of the antibody. Production of antibodies in cultured mammalian cells is an expensive venture and alternative “living factories” are being pursued. Among the alternatives being considered for this purpose are goats, rabbits, and tobacco cells. A quick scan through this text will reveal numerous micrographs depicting the immunolocalization of a particular protein within a cell as visualized in the light or electron microscope. Immunolocalization of proteins within a cell depends on the use of antibodies that have been specifically prepared against that particular protein. Once prepared, the antibody molecules are linked (conjugated) to a substance that makes them visible under the microscope, but does not interfere with the specificity of their interactions. For use with the light microscope, antibodies are generally complexed with small fluorescent molecules, such as fluorescein or rhodamine, to generate derivatives that are then incubated with the cells or sections of cells. The binding sites are then visualized with the fluorescence microscope. This technique is called direct immunofluorescence. It is often preferable to carry out antigen localization by a variation of this technique called indirect immunofluorescence. In indirect immunofluorescence, the cells are incubated with unlabeled antibody, which is allowed to form a complex with the corresponding antigen. The location of the antigen–antibody couple is then revealed in a second step using a preparation of fluorescently labeled antibodies whose combining sites are directed against the antibody molecules used in the first step. Indirect immunofluorescence results in a brighter image because numerous secondary antibody molecules can bind to a single primary antibody. 
Indirect immunofluorescence also has a practical advantage; the conjugated (fluorescent) antibody is readily purchased from vendors. Immunofluorescence provides remarkable clarity because only the proteins bound by the antibody are revealed to the eye; all of the unlabeled materials remain invisible. Localization of antigens in the electron microscope is accomplished using antibodies that have been tagged with electron-dense materials, such as the iron-containing protein ferritin or gold particles. An example of this technique is shown in Figure 8.23c,d.
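Returning to the hybridoma procedure of Figure 18.52, the logic of HAT selection can be written as a small truth table. The sketch below is a simplification under stated assumptions: the `Cell` class and the functions `fuse` and `survives_hat_culture` are hypothetical names, and each cell type is reduced to two yes/no traits (a functional HGPRT enzyme, and the capacity for unlimited division in culture). Unfused myeloma cells lack HGPRT and die in HAT medium; unfused spleen lymphocytes have HGPRT but do not keep growing in culture; only the hybrids combine both traits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_hgprt: bool   # functional hypoxanthine-guanine phosphoribosyl transferase
    immortal: bool    # capable of unlimited division in culture


def fuse(a: Cell, b: Cell) -> Cell:
    """Model a hybrid as carrying any trait present in either parent (an assumption)."""
    return Cell(f"{a.name}x{b.name}", a.has_hgprt or b.has_hgprt, a.immortal or b.immortal)


def survives_hat_culture(cell: Cell) -> bool:
    """A cell persists in long-term HAT culture only if it has HGPRT and is immortal."""
    return cell.has_hgprt and cell.immortal


lymphocyte = Cell("lymphocyte", has_hgprt=True, immortal=False)
myeloma = Cell("myeloma", has_hgprt=False, immortal=True)
hybridoma = fuse(lymphocyte, myeloma)

# Only the fused cell passes both conditions.
survivors = [c.name for c in (lymphocyte, myeloma, hybridoma) if survives_hat_culture(c)]
```

The two-trait model makes the design of the selection explicit: each parent contributes exactly the property the other lacks, so the medium removes both unfused populations without any need to sort cells physically.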


Glossary

Many key terms and concepts are defined here. The parenthetical numbers following most definitions identify the chapter and section in which the term is first defined. For example, a term followed by (3.2) is first defined in Chapter 3, Section 2: “Enzymes.” Terms defined in The Human Perspective or Experimental Pathways essays are identified parenthetically with HP or EP. For example, a term followed by (EP1) is first defined in the Experimental Pathways essay of Chapter 1.

A (aminoacyl) site Site into which aminoacyl-tRNAs enter the ribosome-mRNA complex. (11.8)
Absorption spectrum A plot of the intensity of light absorbed relative to its wavelength. (6.3)
Acetyl CoA A metabolic intermediate produced through catabolism of many compounds, including fatty acids, and used as the initial substrate for the central respiratory pathway, the TCA cycle. (5.2)
Acid A molecule that is capable of releasing a hydrogen ion. (2.3)
Acid hydrolases Hydrolytic enzymes with optimal activity at an acid pH. (8.6)
Actin A globular, cytoskeletal protein that polymerizes to form a flexible, helical filament capable of interacting with myosin. Actin filaments provide mechanical support for eukaryotic cells, determine the cell’s shape, and enable cell movements. (9.5)
Actin-binding proteins Any of nearly 100 different proteins belonging to numerous families that affect the assembly of actin filaments, their physical properties, and their interactions with one another and with other cellular organelles. (9.7)
Action potential The collective changes in membrane potential, beginning with depolarization to threshold and ending with return to resting potential, that occur with stimulation of an excitable cell and act as the basis for neural communication. (4.8)
Action spectrum A plot of the relative rate (or efficiency) of a process produced by light of various wavelengths. (6.3)
Activation energy The minimal kinetic energy needed for a reactant to undergo a chemical reaction. (3.2)
Active site The part of an enzyme molecule that is directly involved in binding the substrate. (3.2)
Active transport The energy-requiring process in which a substance binds to a specific transmembrane protein, changing its conformation to allow passage of the substance through the membrane against the electrochemical gradient for that substance. (4.7)
Adaptive immune response A specific response to a pathogen that requires previous exposure to that agent. Includes responses mediated by antibodies and T lymphocytes. (17)
Adenosine triphosphate (ATP) Nucleotide consisting of adenosine bonded to three phosphate groups; it is the principal immediate-energy source for prokaryotic and eukaryotic cells. (2.5, 3)
Adherens junctions (zonulae adherens) Adherens junctions are a type of specialized adhesive junction particularly common in epithelia. The plasma membranes in this region are separated by 20 to 35 nm and are sites where cadherin molecules are concentrated. The cells are held together by linkages between the extracellular domains of cadherin molecules that bridge the gap between neighboring cells. (7.3)
Aerobes Organisms dependent on the presence of oxygen to metabolize energy-rich compounds. (5.1)
Affinity chromatography Protein purification technique that utilizes the unique structural properties of a protein that allow the molecule to be specifically withdrawn from solution while all other molecules remain in solution. The solution is passed through a column in which a specific interacting molecule (such as a substrate, ligand, or antigen) is immobilized by linkage to an inert material (the matrix). (18.7)
Alleles Alternate forms of the same gene. (10.1)
Allosteric modulation Modification of the activity of an enzyme through interaction with a compound that binds to a site (i.e., allosteric site) other than the active site. (3.3)
Alpha (α) helix One possible secondary structure of polypeptides, in which the backbone of the chain forms a spiral (i.e., helical) conformation. (2.5)
Alternative splicing Widespread mechanism by which a single gene can encode two or more related proteins. (12.5)
Amide bond The chemical bond that forms between carboxylic acids and amines (or acidic and amino functional groups) while producing a molecule of water. (2.4)
Amino acids The monomeric units of proteins; each is composed of three functional groups attached to a central, α carbon: an amino group, a defining side chain, and a carboxyl group. (2.5)
Aminoacyl-tRNA synthetase (AARS) An enzyme that covalently links amino acids to the 3′ ends of their cognate tRNA(s). Each amino acid is recognized by a specific aminoacyl-tRNA synthetase. (11.7)
Amphipathic The biologically important property of a molecule having both hydrophobic and hydrophilic regions. (2.5, 4.3)
Amphoteric Structural property allowing the same molecule to act as an acid or as a base. (2.3)
Anabolic pathway A metabolic pathway resulting in the synthesis of relatively complex products. (3.3)
Anaerobes Organisms that utilize energy-rich compounds through oxygen-independent metabolic pathways such as glycolysis and fermentation. (5.1)
Anaphase The stage of mitosis during which sister chromatids separate from one another. (14.2)
Anaphase A The movement of the chromosomes toward the poles during mitosis. (14.2)
Anaphase B Elongation of the mitotic spindle, which causes the two spindle poles to move farther apart. (14.2)
Aneuploidy A condition in which a cell has an abnormal number of chromosomes that is not a multiple of the haploid number. (16.1)
Angiogenesis The formation of new blood vessels. (16.4)
Angstrom The unit, equivalent to 0.1 nm, used to describe atomic and molecular dimensions. (1.3)
Animal models Laboratory animals that exhibit characteristics of a particular human disease. (HP2)
Anion An ionized atom or molecule with a net negative charge. (2.1)
Antenna The light-harvesting molecules of a photosynthetic unit that absorb photons of varying wavelengths and transfer the energy to the pigment molecules at the reaction center. (6.4)
Antibody An immunoglobulin protein produced by plasma cells derived from B lymphocytes that interacts with the surface of a pathogen or foreign substance to facilitate its destruction. (17.4)
Anticodon A three-nucleotide sequence in each tRNA that functions in the recognition of the complementary mRNA codon. (11.7)
Antigen Any substance recognized by an immune system as foreign to the organism. (17.2)


Antigen-presenting cells (APCs) Cells that display portions of protein antigens at their surface. The portions are derived by proteolysis of larger antigens. The antigenic peptides are displayed in conjunction with MHC molecules. Virtually any cell in the body can function as an APC by displaying peptides in combination with MHC class I molecules, which provides a mechanism for the destruction of infected cells. In contrast, macrophages, dendritic cells, and B cells are referred to as “professional” APCs because they function to ingest materials, process them, and display them to TH cells in combination with MHC class II molecules. (17.4)
Antiserum Fluid containing desired antibodies that remains after the removal of the cells and clotting factors from whole blood that has been exposed to a given antigen. (18.18)
Apoptosis A type of orderly, or programmed, cell death in which the cell responds to certain signals by initiating a normal response that leads to the death of the cell. Death by apoptosis is characterized by the overall compaction of the cell and its nucleus, the orderly dissection of the chromatin into pieces at the hands of a special DNA-splitting endonuclease, and the rapid engulfment of the dying cell by phagocytosis. (15.8)
Artifact A structure seen in a microscopic image that results from the coagulation or precipitation of materials that had no existence in the living cell. (18.2)
Assay Some identifiable feature of a specific protein, such as the catalytic activity of an enzyme, used to determine the relative amount of that protein in a sample. (18.7)
Aster “Sunburst” arrangement of microtubules around each centrosome during mitosis. (14.2)
Atomic force microscope (AFM) A high-resolution scanning instrument that is becoming increasingly important in nanotechnology and molecular biology. The AFM operates by scanning a delicate probe over the surface of the specimen. (18.3)
ATP synthase The ATP-synthesizing enzyme of the inner mitochondrial membrane, which is composed of two chief components: the F1 headpiece and the F0 basal piece, the latter of which is embedded in the membrane itself. (5.5)
Autoantibodies Antibodies capable of reacting with the body’s own tissues. (HP17)
Autoimmune diseases Diseases characterized by an attack of the immune system against the body’s own tissues. Includes multiple sclerosis, insulin-dependent diabetes mellitus, and rheumatoid arthritis. (HP17)
Autophagy The destruction of organelles and their replacement during which an organelle is surrounded by a double membrane. The membrane surrounding the organelle then fuses with a lysosome. (8.6)
Autoradiography A technique for visualizing biochemical processes by allowing an investigator to determine the location of radioactively labeled materials within a cell. Tissue sections containing radioactive isotopes are covered with a thin layer of photographic emulsion, which is exposed by radiation emanating from the tissue. Sites in the cells containing radioactivity are revealed under the microscope by silver grains after development of the overlying emulsion. (8.2, 18.4)
Autotroph Organism capable of surviving on CO2 as its principal carbon source. (6.9)
Axon A single, prominent extension that emerges from the cell body and conducts outgoing impulses away from the cell body and toward the target cell(s). (4.8)
Axonal transport Process by which vesicles, cytoskeletal polymers, and macromolecules are moved along microtubules within the axon of a neuron. Anterograde transport moves materials from the cell body toward the synaptic terminals, whereas retrograde transport moves materials in the opposite direction. (9.3)
Axoneme The central, microtubule-containing core of a cilium or flagellum. Most axonemes consist of nine peripheral doublets, two central microtubules, and numerous accessory structures. (9.3)
B lymphocytes (B cells) Lymphocytes that respond to antigen by proliferating and differentiating into plasma cells that secrete blood-borne antibodies. These cells attain their differentiated state in the bone marrow. (17.2)
Bacterial artificial chromosome (BAC) Cloning vector capable of accepting large foreign DNA fragments that can be cloned in bacteria. Consists of an F plasmid with an origin of replication and the genes required to regulate replication. BACs have played a key role in genome sequencing. (18.15)
Bacteriophages A group of viruses that require bacteria as host cells. (1.4)
Basal body A structure that resides at the base of the cilium or flagellum and which generates their outer microtubules. Basal bodies are identical in structure to centrioles. Both can give rise to one another. (9.3)
Base Any molecule that is capable of accepting a hydrogen ion. (2.3)
Base composition analysis The relative amounts of each base in various samples of DNA. (10.3)
Base excision repair (BER) A cut-and-patch mechanism for the removal from the DNA of altered nucleotides, e.g., uracil (formed from cytosine) and 8-oxoguanine. (13.2)
Basement membrane (basal lamina) Thickened layer of approximately 50 to 200 nm of extracellular matrix that surrounds muscle and fat cells and underlies the basal surface of epithelial tissues such as the skin, the inner lining of the digestive and respiratory tracts, and the inner lining of blood vessels. (7.1)
Benign tumor A tumor composed of cells that are no longer responsive to normal growth controls but lack the capability to invade normal tissues or metastasize to distant sites. (16.3)
Beta (β) clamp One of the noncatalytic components of the replisome that encircles the DNA and keeps the polymerase associated with the DNA template. (13.1)
Beta (β) pleated sheet One possible secondary structure of a polypeptide, in which several β-strands lie parallel to each other, creating the conformation of a sheet. (2.5)
Beta (β) strand One possible secondary structure of a polypeptide, in which the backbone of the chain assumes a folded (or pleated) conformation. (2.5)
Biochemicals Compounds synthesized by living organisms. (2.4)
Bioenergetics The study of the various types of energy transformations that occur in living organisms. (3.1)
Biosynthetic pathway (secretory pathway) Route through the cytoplasm by which materials are synthesized in the endoplasmic reticulum or Golgi complex, modified during passage through the Golgi complex, and transported within the cytoplasm to various destinations such as the plasma membrane, a lysosome, or a large vacuole of a plant cell. The alternative term secretory pathway has been used because many of the materials synthesized in the pathway are destined to be discharged (secreted) outside the cell. (8.1)
Bivalent (tetrad) The complex formed during meiosis by a pair of synapsed homologous chromosomes. (14.3)
Bright-field microscope A microscope in which light from the illuminating source is caused to converge on the specimen by the substage condenser, thereby forming a cone of bright light that can enter the objective lens. (18.1)
Buffers Compounds that can interact with either free hydrogen or hydroxyl ions, minimizing a change in pH. (2.3)

GLOSSARY G-3 C3 pathway The metabolic pathway by which carbon dioxide is assimilated into the organic molecules of the cell during photosynthesis. Ribulose 1,5-bisphosphate (RuBP) is used by Rubisco as the initial CO2 acceptor. The product then fragments into two three-carbon PGA molecules. (6.6) C3 plants Plants that depend solely on the C3 pathway to fix atmospheric CO2. (6.6) C4 pathway Alternate pathway for carbon fixation utilizing phosphoenolpyruvate as the CO2 acceptor to produce four-carbon compounds (predominantly malate and oxaloacetate). (6.6) C4 plants Plants, primarily tropical grasses, that use the C4 carbon fixation pathway. (6.6) Cadherins A family of related glycoproteins that mediate Ca2⫹-dependent cell–cell adhesion. (7.3) Calcium-binding proteins Proteins, such as calmodulin, that bind calcium and allow the calcium to elicit a variety of cellular responses. (15.5) Calmodulin A small, calcium-binding protein that is widely distributed. Each molecule of calmodulin contains four binding sites for calcium. (15.5) Calvin cycle (or Calvin-Benson cycle) The pathway for conversion of CO2 into carbohydrate; the cycle occurs in cyanobacteria and all eukaryotic photosynthetic cells. (6.6) CAM plants Plants that utilize PEP carboxylase in fixing CO2 just as C4 plants do, but conduct the light-dependent reactions and carbon fixation at different times of the day, so that stomata can be closed during the peak water loss hours of the day. (6.6) Carbohydrates (Glycans) Organic molecules including simple sugars (monosaccharides) and multisaccharide polymers, which largely serve as energy-storage and structural compounds in cells. (2.5) Caspases A family of cysteine proteases that are activated at an early stage of apoptosis and are responsible for the degradative events observed during cell death. (15.8) Catabolic pathway A metabolic pathway in which relatively complex molecules are broken down into simpler products. 
(3.3) Cation An ionized atom or molecule with an extra positive charge. (2.1) Cell culture The technique used to grow cells outside the organism. (18.5) Cell cycle The stages through which a cell passes from one cell division to the next. (14.1) Cell division The process by which new cells originate from other living cells. (14)

Cell fractionation Bulk separation of the various cell organelles by differential centrifugation. (8.2) Cell-free system An experimental system to study cellular activities that does not require whole cells. Such systems typically contain a preparation of purified proteins and/or subcellular fractions and are amenable to experimental manipulation. (8.2) Cell fusion Technique whereby two different types of cells (from one organism or from different species) are joined to produce one cell with one, continuous plasma membrane. (4.6) Cell line Cells that are commonly used in tissue culture studies that have undergone genetic modifications that allow them to be grown indefinitely. (18.5) Cell-mediated immunity Carried out by T lymphocytes, that when activated, can specifically recognize and kill an infected (or foreign) cell. (17.1) Cell plate Structure between the cytoplasm of two newly formed daughter cells that gives rise to a new cell wall in plant cells. (7.6, 14.3) Cell signaling Communication in which information is relayed across the plasma membrane to the cell interior and often to the cell nucleus by means of a series of molecular interactions. (15) Cell theory Theory of biological organization, which has three tenets: all organisms are made up of one or more cells; the cell is the structural unit of life; cells only arise from the division of preexisting cells. (1.1) Cell wall A rigid, nonliving structure that provides support and protection for the cell it surrounds. (7.6) Cellulose Unbranched glucose polymer with ␤(1n4) linkages that assembles into cables and serves as a principal structural element of plant cell walls. (2.5) Centrioles Cylindrical structures, about 0.2 ␮m across and typically about twice as long, that contain nine evenly spaced fibrils, each of which appears in cross section as a band of three microtubules. Centrioles are nearly always found in pairs, with each of the members situated at right angles to one another. 
(9.3) Centromere Marked indentation on a mitotic chromosome that serves as the site of kinetochore formation. (12.1) Centrosome A complex structure that contains two barrel-shaped centrioles surrounded by amorphous, electron-dense pericentriolar material (or PCM) where microtubules are nucleated. (9.3) Chaperones Proteins that bind to other polypeptides, preventing their aggregation and promoting their folding and/or assembly into multimeric proteins. (EP2)

Chaperonins Members of the Hsp60 class of chaperones, e.g., GroEL, that form a cylindrical complex of 14 subunits within which the polypeptide folding reaction takes place. (EP2) Checkpoint Mechanisms that halt the progress of the cell cycle if (1) any of the chromosomal DNA is damaged, or (2) certain critical processes, such as DNA replication or chromosome alignment during mitosis, have not been properly completed. (14.1) Chemiosmotic mechanism The mechanism for ATP synthesis whereby the movement of electrons through the electron-transport chain results in establishment of a proton gradient across the bacterial, thylakoid, or inner mitochondrial membrane, with the gradient acting as a high-energy intermediate, linking oxidation of substrates to the phosphorylation of ADP. (EP5) Chemoautotroph An autotroph that utilizes the energy stored in inorganic molecules (such as ammonia, hydrogen sulfide, or nitrites) to convert CO₂ into organic compounds. (6) Chiasmata Specific points of attachment between homologous chromosomes of bivalents, observed as the homologous chromosomes move apart at the beginning of the diplotene stage of meiotic prophase I. The chiasmata are located at the sites on the chromosomes at which genetic exchange during crossing over had previously occurred. (14.3) Chlorophyll The most important light-absorbing photosynthetic pigment. (6.3) Chloroplast A specialized membrane-bound cytoplasmic organelle that is the principal site of photosynthesis in eukaryotic cells. (6) Cholesterol Sterol found in animal cells that can constitute up to half of the lipid in a plasma membrane, with the relative proportion in any membrane affecting its fluid behavior. (4.3) Chromatid Paired, rod-shaped members of mitotic chromosomes that together represent the duplicated chromosomes formed during replication in the previous interphase. (14.2) Chromatin A complex nucleoprotein material that makes up the chromosomes of eukaryotes. 
(1.3) Chromatin remodeling complexes Multisubunit protein complexes that use the energy released by ATP hydrolysis to alter chromatin structure and allow binding of transcription factors to regulatory sites in the DNA. (12.4) Chromatography A term used for a wide variety of techniques in which a mixture of dissolved components is fractionated as it moves through some type of immobile matrix. (18.7)

Chromosome compaction (chromosome condensation) A process in which a cell converts its chromosomes into shorter, thicker structures capable of being segregated during mitosis or meiosis. (14.2) Chromosomes Threadlike strands that are composed of the nuclear DNA of eukaryotic cells and are the carriers of genetic information. (10.1) Cilia Hairlike motile organelles that project from the surface of a variety of eukaryotic cells. Cilia tend to occur in large numbers on a cell’s surface. (9.3) Ciliary or axonemal dynein A huge protein (up to 2 million daltons) responsible for conversion of the chemical energy of ATP into the mechanical energy of ciliary locomotion. (9.3) Cis cisternae The cisternae of the Golgi complex closest to the endoplasmic reticulum. (8.4) Cis Golgi network (CGN) The cis-most face of the organelle that is composed of an interconnected network of tubules located at the entry face closest to the ER. The CGN is thought to function primarily as a sorting station that distinguishes between proteins to be shipped back to the ER and those that are allowed to proceed to the next Golgi station. (8.4) Cisternal (luminal) space The region of the cytoplasm enclosed by the membranes of the endoplasmic reticulum or Golgi complex. (8.3) Clonal selection theory Well-supported theory that B and T lymphocytes develop their ability to produce specific antibodies or T cell receptors prior to exposure to antigen. Should an antigen then enter the body, it can interact specifically with those B and T lymphocytes bearing complementary receptors. Interaction between the antigen and the B or T lymphocytes leads to proliferation of the lymphocyte to form a clone of cells capable of responding to that specific antigen. (17.2) Coactivators The intermediaries that help bound transcription factors stimulate the initiation of transcription at the core promoter. 
(12.4) Coated pits Specialized domains of the plasma membrane; coated pits serve as collection points for receptors that bind substances that enter a cell by means of endocytosis. (8.8) Coated vesicles Vesicles that bud from a membrane compartment typically possess a multi-subunit protein coat that promotes the budding process and binds specific membrane proteins. COPI-, COPII-, and clathrin-coated vesicles are the best characterized coated vesicles. (8.5)

Codons Sequences of three nucleotides (nucleotide triplets) in mRNAs that specify amino acids. (11.6) Coenzyme An organic, nonprotein component of an enzyme. (3.2) Cofactor The nonprotein component of an enzyme; it can be either inorganic or organic. (3.2) Cohesin A multiprotein complex that keeps replicated chromatids associated with one another until they are separated during cell division. (14.2) Collagens A family of fibrous glycoproteins known for their high tensile strength that function exclusively as part of the extracellular matrix. (7.1) Competitive inhibitor An enzyme inhibitor that competes with substrate molecules for access to the active site. (3.2) Complement A system of blood-plasma proteins that act as part of the innate immune system to destroy invading microorganisms either directly (by making their plasma membrane porous) or indirectly (by making them susceptible to phagocytosis). (17.1) Complementary The relationship between the sequence of bases in the two strands of the double helix of DNA. Structural restrictions on the configurations of the bases limit bonding to the two pairs: adenine-thymine and guanine-cytosine. (10.2) Conductance The movement of small ions across membranes. (4.7) Confocal scanning microscope A microscope in which the specimen is illuminated by a finely focused laser beam that rapidly scans across the specimen at a single depth, thus illuminating only a thin plane (or “optical section”) within the object. The microscope is typically used with fluorescently stained specimens, and the light emitted from the illuminated optical section is used to form an image of the section on a video screen. (17.1) Conformation The three-dimensional arrangement of the atoms within a molecule, often important in understanding the biological activity of proteins and other molecules in a living cell. (2.5) Conformational change A predictable movement within a molecule that is associated with biological activity. 
(2.5) Congression The movement of duplicated chromosomes to the metaphase plate during prometaphase of mitosis. (14.2) Conjugate acid Paired form created when a base accepts a proton in an acid-base reaction. (2.3) Conjugate base Paired form created when an acid loses a proton in an acid-base reaction. (2.3)
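
The Codons and Complementary entries above can be illustrated with a short Python sketch. It is not drawn from the text: the function names and example sequences are hypothetical, the pairing table simply restates the adenine-thymine and guanine-cytosine rule, and reading is assumed to begin at the first base (start-codon selection is ignored).

```python
# Base-pairing rule from the "Complementary" entry: A pairs with T, G with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the complementary DNA strand, written 5'->3'.

    The two strands of the double helix are antiparallel, so the input is
    read in reverse while each base is swapped for its partner.
    """
    return "".join(DNA_PAIRS[base] for base in reversed(strand.upper()))


def split_into_codons(mrna: str) -> list[str]:
    """Split an mRNA into consecutive nucleotide triplets.

    Assumption: the reading frame starts at the first base. Any incomplete
    triplet left at the end of the sequence is dropped.
    """
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    print(complementary_strand("ATGC"))
    print(split_into_codons("AUGGCUUAAG"))
```

Shifting the starting index of `split_into_codons` by one or two bases changes every triplet downstream, which is the effect described under Frameshift mutations.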

Conjugated protein Protein linked covalently or noncovalently to substances other than amino acids, such as metals, nucleic acids, lipids, and carbohydrates. (2.5) Connective tissue Tissue that consists largely of a variety of distinct fibers that interact with each other in specific ways. The deeper layer of the skin (the dermis) is a type of connective tissue. (7.0) Connexon Multisubunit complex of a gap junction formed from the clustering within the plasma membrane of an integral membrane protein called connexin. Each connexon is composed of six connexin subunits arranged around a central opening (or annulus) about 1.5 nm in diameter. (7.5) Consensus sequence The most common version of a conserved sequence. The TTGACA sequence of a bacterial promoter (known as the -35 element) is an example of a consensus sequence. (11.2) Conserved sequences Refers to the amino acid sequences of particular polypeptides or the nucleotide sequences of particular nucleic acids. If two sequences are similar to one another, i.e., homologous, they are said to be conserved, which indicates that they have not diverged very much from a common ancestral sequence over long periods of evolutionary time. (2.4) Constant regions The portions of light and heavy antibody polypeptide chains that have the same amino acid sequence. (17.4) Constitutive Occurs in a continual, non-regulated manner. Can relate to a normal process, such as constitutive secretion, or the result of a mutation that leads to a breakdown in regulation, which causes continual activity, such as the constitutive activation of a signaling pathway. Constitutive heterochromatin Chromatin that remains in the compacted state in all cells at all times and, thus, represents DNA that is permanently silenced. It consists primarily of highly repeated sequences. (12.1) Constitutive secretion Discharge of materials synthesized in the cell into the extracellular space in a continual manner. 
(8.1) Contrast The difference in appearance between adjacent parts of an object or an object and its background. (18.1) Conventional (or type II) myosins A family of myosins, first identified in muscle tissue, that are the primary motors for muscle contraction but are also found in a variety of nonmuscle cells. Type II myosins are needed for splitting a cell in two during cell division, generating tension at focal adhesions, and the turning behavior of growth cones. (9.5)

Copper atoms of the electron transport chain A type of electron carrier; these atoms, located within a single protein complex of the inner mitochondrial membrane, accept and donate a single electron as they alternate between Cu²⁺ and Cu¹⁺ states. (5.3) Cotransport A process that couples the movement of two solutes across a membrane, termed symport if the two solutes are moved in the same direction and antiport if the two move in opposite directions. (4.7) Covalent bond The type of chemical bond in which electron pairs are shared between two atoms. (2.1) Cristae The many deep folds that are characteristic of the inner mitochondrial membrane, which contain the molecular machinery of oxidative phosphorylation. (5.1) Crossing over (genetic recombination) Reshuffling of the genes on chromosomes (thereby disrupting linkage groups) that occurs during meiosis as a result of breakage and reunion of segments of homologous chromosomes. (10.1) Crystal structure A structural model based on X-ray crystallography. (18.8) Cyanobacteria Evolutionarily important and structurally complex prokaryotes that possess oxygen-producing photosynthetic membranes. (1.3) Cyclic photophosphorylation The formation of ATP in chloroplasts that is carried out by PSI independent of PSII. (6.5) Cyclin-dependent kinases (Cdks) Enzymes that control progression of cells through the cell cycle. (14.1) Cytochromes A type of electron carrier consisting of a protein bound to a heme group. (5.3) Cytokines Proteins secreted by cells of the immune system that alter the behavior of other immune cells. (17.3) Cytokinesis The part of the cell cycle during which physical division of the cell into two daughter cells occurs. (14.2) Cytoplasmic dynein A huge protein (molecular mass over 1 million daltons) composed of numerous polypeptide chains. The molecule contains two large globular heads that act as force-generating engines. 
Evidence suggests that cytoplasmic dynein functions in the movement of chromosomes during mitosis and also as a minus end-directed microtubular motor for the movement of vesicles and membranous organelles through the cytoplasm. (9.3) Cytoskeleton An elaborate interactive network composed of three well-defined filamentous structures: microtubules, microfilaments, and intermediate filaments. These elements function to provide structural support; an internal framework

responsible for positioning the various organelles within the interior of the cell; as part of the machinery required for movement of materials and organelles within cells; as force-generating elements responsible for the movement of cells from one place to another; as sites for anchoring messenger RNAs and facilitating their translation into polypeptides; and as a signal transducer, transmitting information from the cell membrane to the cell interior. (9) Cytosol The region of fluid content of the cytoplasm outside of the membranous organelles of a eukaryotic cell. (1.3) Cytosolic surface The surface of a membrane that faces the cytosol. (8.3) Dalton A measure of molecular mass, with one dalton equivalent to one unit of atomic mass (e.g., the mass of a 1H atom). (1.4) Dehydrogenase An enzyme that catalyzes a redox reaction by removing a hydrogen atom from one reactant. (3.3) Deletion A chromosomal aberration that occurs when a portion of a chromosome is missing. (HP12) Deletion Loss of a segment of DNA caused by the misalignment of homologous chromosomes during meiosis. (10.4) Denaturation Separation of the DNA double helix into its two component strands. (10.4) Denaturation The unfolding or disorganization of a protein from its native or fully folded state. (2.5) Dendrites Fine extensions from the cell bodies of most neurons; dendrites receive incoming information from external sources, typically other neurons. (4.8) Deoxyribonucleic acid (DNA) A double-stranded nucleic acid composed of two polymeric chains of deoxyribose-containing nucleotides. The genetic material of all cellular organisms (2.5) DNA may be duplicated as in DNA replication (13) and produced in large quantities of a specific segment as in DNA cloning (18.11). Depolarization A decrease in the electrical potential difference across a membrane. 
(4.8) Desmosomes (maculae adherens) Disc-shaped adhesive junction containing cadherins found in a variety of tissues, but most notably epithelia, where they are located basal to the adherens junction. Dense cytoplasmic plaques on the inner surface of the plasma membranes in this region serve as sites of anchorage for looping intermediate filaments that extend into the cytoplasm. (7.3) Differential centrifugation A technique used to isolate a particular organelle in bulk quantity, which depends on the principle that, as long as they are more dense than the

surrounding medium, particles of different size and shape travel toward the bottom of a centrifuge tube at different rates when placed in a centrifugal field. (18.6) Differentiation The process through which unspecialized cells become more complex and specialized in structure and function. (1.3) Diffusion The spontaneous process in which a substance moves from an area of higher concentration to one of lower concentration, eventually reaching the same concentration in all areas. (4.7) Diploid Containing both members of each pair of homologous chromosomes, as exemplified by most somatic cells. Diploid cells are produced from diploid parental cells during mitosis. Contrast with haploid. (10.5, 14.3) Direct immunofluorescence Technique for the intracellular localization of an antigen, in which antibodies are conjugated with small fluorescent molecules to form derivatives that are then incubated with the cells or sections of cells. The binding sites are then visualized with the fluorescence microscope. (18.18) Disulfide bridge Forms between two cysteines that are distant from one another in the polypeptide backbone or in two separate polypeptides. They help stabilize the intricate shapes of proteins. (2.5) DNA gyrase A type II topoisomerase that is able to change the state of supercoiling in a DNA molecule by relieving the tension that builds up during replication. It does this by traveling along the DNA and acting like a “swivel,” changing the positively supercoiled DNA into negatively supercoiled DNA. (13.1) DNA ligase The enzyme responsible for joining DNA fragments into a continuous strand. (13.1) DNA methylation An epigenetic process in which methyl groups are added to cytosine residues in DNA by DNA methyltransferases. In vertebrates, DNA methylation occurs on certain CpG residues in the promoter regions of genes and is associated with inactivation of gene expression. Also associated more widely with preventing transposition of mobile genetic elements. 
(12.4) DNA microarrays “DNA chips” prepared by spotting DNAs from different genes in a known, ordered location on a glass slide. The slide is then incubated with fluorescently labeled cDNAs whose hybridization level provides a measure of the level of expression of each gene in the array. (12.4, 16.3) DNA polymerases The enzymes responsible for constructing new DNA strands during replication or DNA repair. (13.1)

DNA-dependent RNA polymerases (RNA polymerases) The enzymes responsible for transcription in both prokaryotic and eukaryotic cells. (11.2) DNA tumor viruses Viruses capable of infecting vertebrate cells, transforming them into cancer cells. DNA viruses have DNA in the mature virus particle. (16.2) Dolichol phosphate Hydrophobic molecule built from more than 20 isoprene units that assembles the basal, or core, segment of carbohydrate chains within glycoproteins. (8.3) Domain A region within a protein (or RNA) that folds and functions in a semi-independent manner. (2.5) Double-strand breaks (DSBs) DNA damage often resulting from ionizing radiation involving a fracture of both strands of the double helix. DSBs can be debilitating to a cell and at least two distinct repair systems are devoted to their repair. (13.2) Dynamic instability A term that relates to the assembly/disassembly properties of the plus end of a microtubule. The term describes the fact that growing and shrinking microtubules can coexist in the same region of a cell, and that a given microtubule can switch back and forth unpredictably between growing and shortening phases. (9.3) Dynein An exceptionally large, cargo-carrying, multisubunit motor protein that moves along microtubules toward their minus end. This family of proteins occurs as cytoplasmic dyneins and ciliary or axonemal dyneins. (9.3) Effector A substance that brings about a cellular response to a signal. (15.1) Electrochemical gradient The overall difference in electrical charge and in solute concentration that determines the ability of an electrolyte to diffuse between two compartments. (4.7) Electrogenic Any process that contributes directly to a separation of charge across a membrane. (4.7) Electron-transfer potential The relative affinity for electrons, such that a compound with a low affinity has a high potential to transfer one or more electrons in a redox reaction (and thus act as a reducing agent). 
(5.3) Electron-transport or respiratory chain Membrane-embedded electron carriers that accept high-energy electrons and sequentially lower the energy state of the electrons as they pass through the chain, with the net result of capturing energy for use in synthesizing ATP or other energy-storage molecules. (5.3) Electronegative atom The atom with the greater attractive force; the atom that can capture the major share of electrons of a covalent bond. (2.1)

Electrophoresis Fractionation techniques that depend on the ability of charged molecules to migrate when placed in an electric field. (18.7) Embryonic stem (ES) cells A type of cell that has virtually unlimited powers of differentiation, found in the mammalian blastocyst, which is an early stage of embryonic development comparable to the blastula of other animals. (HP1, 18.17) Endergonic reactions Reactions that are thermodynamically unfavorable and cannot occur spontaneously, possessing a +ΔG value. (3.1) Endocytic pathway Route for moving materials from outside the cell (and from the membrane surface of the cell) to compartments, such as endosomes and lysosomes, located within the cell interior. (8.1, 8.8) Endocytosis Mechanism for the uptake of fluid and solutes into a cell. Can be divided into two types: bulk-phase endocytosis, which is non-specific, and receptor-mediated endocytosis, which requires the binding of solute molecules such as LDL or transferrin to a specific cell-surface receptor. (8.8) Endomembrane system Functionally and structurally interrelated group of membranous cytoplasmic organelles including the endoplasmic reticulum, Golgi complex, endosomes, lysosomes, and vacuoles. (8) Endoplasmic reticulum (ER) A system of tubules, cisternae, and vesicles that divides the fluid content of the cytoplasm into a luminal space within the ER membrane and a cytosolic space outside the membranes. (8.3) Endosomes Organelles of the endocytic pathway. Materials taken up by endocytosis are transported to early endosomes where they are sorted, and then on to late endosomes and ultimately to lysosomes. Late endosomes also function as destination sites of lysosomal enzymes transported from the Golgi complex. (8.8) Endosymbiont theory Proposal, which is based on considerable evidence, that mitochondria and chloroplasts arose from symbiotic prokaryotes that took up residence within a primitive host cell. 
(EP1) Endothermic reactions Those gaining heat under conditions of constant pressure and volume. (3.1) Energy The capacity to do work, it exists in two forms: potential and kinetic. (3.1) Enhancer A regulatory site in the DNA that may be located at considerable distance either upstream or downstream from the promoter that it regulates. Binding of one or more transcriptional factors to the enhancer can dramatically increase the rate of transcription of the gene. (12.4)

Enthalpy change (ΔH) The change during a process in the total energy content of the system. (3.1) Entropy (S) A measure of the relative disorder of the system or universe associated with random movements of matter; because all movements cease at absolute zero (0 K), entropy is zero only at that temperature. (3.1) Enzyme inhibitor Any molecule that can bind to an enzyme and decrease its activity, classified as noncompetitive or competitive based on the nature of the interaction with the enzyme. (3.2) Enzyme–substrate complex The physical association between an enzyme and its substrate(s), during which catalysis of the reaction takes place. (3.2) Enzymes The vitally important protein catalysts of cellular reactions. (3.2) Epigenetic inheritance Heritable changes, i.e., changes that can be transmitted from one cell to its progeny, that do not involve changes in DNA sequence. Epigenetic changes can result from DNA methylation, covalent modification of histones, and likely other types of chromatin modifications. (12.2, 12.4) Epithelial tissue Tissue composed of closely packed cells that line the spaces within the body. The outer layer of skin (the epidermis) is a type of epithelial tissue. (7.0) Epitope (or antigenic determinant) Portion of an antigen that binds to the antigen-combining site of a specific antibody. (17.4) Equilibrium constant of a reaction (Keq) The ratio of product concentrations to reactant concentrations when a reaction is at equilibrium. (3.1) Ester bond The chemical bond that forms between carboxylic acids and alcohols (or acidic and alcoholic functional groups) while producing a molecule of water. (2.4) Euchromatin Chromatin that returns to its dispersed state during interphase. (12.2) Eukaryotic cells Cells (e.g., plant, animal, protist, fungal) characterized by an internal structure based on organelles such as the nucleus, derived from eu-karyon, or true nucleus. 
(1.3) Excitation-contraction coupling The steps that link the arrival of a nerve impulse at the muscle plasma membrane to the shortening of the sarcomeres deep within the muscle fiber. (9.6) Excited state Electron configuration of a molecule after absorption of a photon has energized an electron to shift from an inner to an outer orbital. (6.3) Exergonic reactions Reactions that are thermodynamically favorable, possessing a −ΔG value. (3.1)
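
The thermodynamic entries above (enthalpy change, entropy, equilibrium constant, endergonic and exergonic reactions) are tied together by two standard relations that the glossary itself does not spell out: ΔG = ΔH − TΔS, and, for standard conditions, ΔG° = −RT ln Keq. The following Python sketch is an illustration under those assumptions; the function names and the numerical inputs are hypothetical, not values from the text.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def free_energy_change(delta_h: float, delta_s: float, temperature_k: float) -> float:
    """Return delta G (J/mol) from delta H (J/mol), delta S (J/(mol*K)), and T (K)."""
    return delta_h - temperature_k * delta_s


def standard_free_energy_from_keq(keq: float, temperature_k: float = 298.15) -> float:
    """Return the standard free-energy change (J/mol) for a given equilibrium constant."""
    return -R * temperature_k * math.log(keq)


def classify(delta_g: float) -> str:
    """Label a reaction by the sign of its free-energy change."""
    if delta_g < 0:
        return "exergonic"
    if delta_g > 0:
        return "endergonic"
    return "at equilibrium"


if __name__ == "__main__":
    # Hypothetical reaction: releases heat and increases entropy.
    dg = free_energy_change(delta_h=-10_000.0, delta_s=20.0, temperature_k=298.15)
    print(dg, classify(dg))
    # An equilibrium constant above 1 corresponds to a negative standard value.
    print(standard_free_energy_from_keq(10.0))
```

An equilibrium constant greater than 1 gives a negative standard free-energy change, so products are favored under standard conditions; a constant below 1 gives a positive value.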

Exocytosis The process of membrane fusion and content discharge during which the membrane of a secretory granule or vesicle comes into contact with the overlying plasma membrane with which it fuses, thereby forming an opening through which the contents of the granule or vesicle can be released. (8.5) Exon shuffling The movement of genetic “modules” among unrelated genes facilitated by the presence of introns; the introns act like inert spacer elements between exons. (11.4) Exons Those parts of a split gene that contribute to a mature RNA product. (11.4) Exon-junction complex (EJC) A complex of proteins deposited on the transcript 20–24 nucleotides upstream from the newly formed exon-exon junction. This conglomeration of proteins stays with the mRNA until it is translated. (11.8) Exonuclease A DNA- or RNA-digesting enzyme that attaches to either the 5′ or 3′ end of the nucleic acid strand and removes one nucleotide at a time from that shrinking end. (e.g., 12.6, 13.1) Exothermic reactions Those releasing heat under conditions of constant pressure and volume. (3.1) Extracellular matrix (ECM) An organized network of extracellular materials that is present beyond the immediate vicinity of the plasma membrane. It may play an integral role in determining the shape and activities of the cell. (7.1) Extracellular messenger molecules The means by which cells usually communicate with each other. Extracellular messengers can travel a short distance and stimulate cells that are in close proximity to the origin of the message, or they can travel throughout the body, potentially stimulating cells that are far away from the source. (15.1) Facilitated diffusion Process by which the diffusion rate of a substance is increased through interaction with a substance-specific membrane protein. 
(4.7) Facilitative transporter A transmembrane protein that binds a specific substance and, in so doing, changes conformation so as to facilitate diffusion of the substance down its concentration gradient. (4.7) Facultative heterochromatin Chromatin that has been specifically inactivated during certain phases of an organism’s life. (12.2) Families Groupings of proteins that have arisen from a single ancestral gene that underwent a series of duplications and subsequent modifications during the course of evolution. (2.5) Fats Molecules consisting of a glycerol backbone linked by ester bonds to three fatty acids, also termed triacylglycerols. (2.5)

Fatty acid Long, unbranched hydrocarbon chain with a single carboxylic acid group at one end. (2.5) Feedback inhibition A mechanism to control metabolic pathways where the end product interacts with an enzyme in the pathway, resulting in inactivation of the enzyme. (3.3) Fermentation An anaerobic metabolic pathway in which pyruvate is converted to another molecule (often lactate or ethanol, depending on the organism) and NAD⁺ is regenerated for use in glycolysis. (3.3) Fibrous protein One with a tertiary structure that is greatly elongated, resembling a fiber. (2.5) First law of thermodynamics The law of conservation of energy, which states that energy can neither be created nor destroyed. (3.1) Fixative A chemical solution that kills cells by rapidly penetrating the cell membrane and immobilizing all of its macromolecular material in such a way that the structure of the cell is maintained as close as possible to that of the living state. (18.1) Flagella Hairlike motile organelles that project from the surface of a variety of eukaryotic cells. Essentially the same structure as cilia but present in far fewer numbers. (9.3) Flavoproteins A type of electron carrier in which a polypeptide is bound to one of two related prosthetic groups, either FAD or FMN. (5.3) Fluid-mosaic model Model presenting membranes as dynamic structures in which both lipids and associated proteins are mobile and capable of moving within the membrane to engage in interactions with other membrane molecules. (4.2) Fluorescence in situ hybridization (FISH) Technique in which probe DNAs (or RNAs) are labeled with fluorescent dyes, hybridized to single-stranded DNA that remains within its chromosome, and localized with a fluorescence microscope. 
(10.4) Fluorescence recovery after photobleaching (FRAP) Technique to study movement of membrane components that consists of three steps: (1) linking cellular components to a fluorescent dye, (2) irreversibly bleaching (removing the visible fluorescence of) a portion of the cell, (3) monitoring the reappearance of fluorescence (due to random movement of fluorescent-dyed components from outside of the bleached area) in the bleached portion of the cell. (4.6, 9.2) Fluorescence resonance energy transfer (FRET) A technique to measure changes in distance between two parts of a protein (or between two separate proteins within a larger structure). Based on the transfer of

energy from a donor fluorochrome to an acceptor fluorochrome, changing the fluorescence intensity of the two molecules. (Fig. 18.8) Focal adhesions Adhesive structures characteristic of cultured cells adhering to the surface of a culture dish. The plasma membrane in the region of the focal adhesion contains clusters of integrins that connect the extracellular material that coats the culture dish to the actin-containing microfilament system of the cytoskeleton. (7.2) Fractionated Disassembly of a preparation into its component ingredients so that the properties of individual species of molecules can be examined. (18.7) Frameshift mutations Mutations in which a single base pair is either added to or deleted from the DNA, resulting in an incorrect reading frame from the point of mutation through the remainder of the coding sequence. (11.8) Free energy change (ΔG) The change during a process in the amount of energy available to do work. (3.1) Free radical Highly reactive atom or molecule that contains a single unpaired electron. (HP2) Freeze-etching Technique in which tissue is freeze fractured, then exposed briefly to a vacuum so a thin layer of ice can evaporate from above and below the fractured surfaces, exposing features for identification by electron microscopy. (18.2) Freeze-fracture replication Technique in which a tissue sample is first frozen and then struck with a blade that fractures the tissue block along the lines of least resistance, often resulting in a fracture line between the two leaflets of the lipid bilayer; metals are then deposited on the exposed surfaces to create a shadowed replica that is analyzed by electron microscopy. (14.4, 18.2) Functional groups Particular groupings of atoms that tend to act as a unit, often affecting the chemical and physical behavior of the larger organic molecules to which they belong. (2.4) γ-tubulin A type of tubulin that is a critical component in microtubule nucleation. (9.3) G protein See GTP-binding protein. 
G protein-coupled receptors (GPCRs) A group of related receptors that span the plasma membrane seven times. The binding of the ligand to its specific receptor causes a change in the conformation of the receptor that increases its affinity for a heterotrimeric G protein initiating a response within the cell. (15.3)

G1 Period within the cell cycle following mitosis and preceding the initiation of DNA synthesis. (14.1) G2 Period within the cell cycle between the end of DNA synthesis and the beginning of M phase. (14.1) Gametophyte The haploid stage of the life cycle of plants that begins with spores generated during the sporophyte stage. During the gametophyte stage, gametes form through the process of mitosis. (14.3) Gap junctions Sites between animal cells that are specialized for intercellular communication. Plasma membranes of adjacent cells come within about 3 nm of each other, and the gap is spanned by very fine “pipelines,” or connexons that allow the passage of small molecules. (7.5) Gated channel An ion channel that can change conformation between a form open to its solute ion and one closed to the ion; such channels can be voltage gated, chemical gated, or mechanically gated depending on the nature of the process that triggers the change in conformation. (4.7) Gel filtration Purification technique in which separation of proteins (or nucleic acids) is based primarily on molecular mass. The separation material consists of tiny porous beads that are packed into a column through which the solution of protein slowly passes. (18.7) Gene duplication The duplication of a small portion of a single chromosome usually by a process of unequal crossing over. (10.5) Gene regulatory protein A protein that is capable of recognizing a specific sequence of base pairs within the DNA and binding to that sequence with high affinity thereby altering gene expression. (12.4) Gene therapy The process by which a patient is treated by altering the genotype of diseased cells. (HP4) General transcription factors (GTFs) Auxiliary proteins necessary for RNA polymerase to initiate transcription. These factors are termed “general” because the same ones are required for transcription of a diverse array of genes by the polymerase. 
(11.4) Genes In nonmolecular terms, a unit of inheritance that governs the character of a particular trait. In molecular terms, a segment of DNA containing the information for a single polypeptide or RNA molecule, including transcribed but non-coding regions. (10.1) Genetic code Manner in which the nucleotide sequences of DNA encode the information for making protein products. (11.6) Genetic map Assignment of genetic markers to relative positions on a chromosome based on crossover frequency. (10.5)
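The Genetic map entry above places markers by crossover frequency. A minimal sketch of that arithmetic follows, assuming the usual convention that one map unit corresponds to 1 percent recombinant offspring. The function name and the counts are hypothetical, and the simple proportionality is only reasonable for markers that lie fairly close together, since double crossovers cause it to underestimate larger distances.

```python
def map_distance(recombinant_offspring: int, total_offspring: int) -> float:
    """Return the distance between two markers in map units.

    Assumes 1 map unit = 1 percent recombinant offspring (a convention,
    not something stated in the glossary entry itself).
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must lie between 0 and total_offspring")
    # Crossover (recombination) frequency expressed as a percentage.
    return 100.0 * recombinant_offspring / total_offspring


# Hypothetical cross: 170 recombinants among 1000 scored offspring.
print(map_distance(170, 1000))
```

Markers with larger values lie farther apart on the chromosome; ordering several markers by pairwise distance gives their relative positions.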

Genetic polymorphisms Sites in the genome that vary with relatively high frequency among different individuals in a species population. (10.6) Genetic recombination (crossing over) Reshuffling of the genes on chromosomes (thereby disrupting linkage groups) that occurs as a result of breakage and reunion of segments of homologous chromosomes. (10.1) Genome The complement of genetic information unique to each species of organism. Equivalent to the DNA of a haploid set of chromosomes from that species. (10.4) Germ cells Cells (e.g., spermatogonia, oogonia, spermatocytes, oocytes) that have the capability to give rise to gametes. Globular protein One with a tertiary structure that is compact, resembling a globe. (2.5) Glycocalyx A layer closely applied to the outer surface of the plasma membrane. It contains membrane carbohydrates along with extracellular materials that have been secreted by the cell into the external space, where they remain closely associated with the cell surface. (7.1) Glycogen Highly branched glucose polymer that serves as readily available chemical energy in most animal cells. (2.5) Glycolipids Sphingosine-based lipid molecules linked to carbohydrates, often active components of plasma membranes. (4.3) Glycolysis The first pathway in the catabolism of glucose, it does not require oxygen and results in the formation of pyruvate. (3.3) Glycosaminoglycans (GAGs) A group of highly acidic polysaccharides with the structure of -A-B-A-B-, where A and B represent two different sugars. (2.5) Glycosidic bond The chemical bond that forms between sugar molecules. (2.5) Glycosylation The reactions by which sugar groups are added to proteins and lipids. (4.3, 8.3, 8.4) Glycosyltransferases A large family of enzymes that transfer specific sugars from a specific donor (a nucleotide sugar) to a specific receptor (typically the growing end of an oligosaccharide chain). 
(8.3) Glyoxysomes Organelles found in plant cells that serve as sites for enzymatic reactions including the conversion of stored fatty acids to carbohydrate. (5.6) Golgi complex Network of smooth membranes organized into a characteristic morphology, consisting of flattened, disc-like cisternae with dilated rims and associated vesicles and tubules. The Golgi complex functions primarily as a processing plant where proteins newly synthesized in the

endoplasmic reticulum are modified in specific ways. (8.4) GPI-anchored proteins Peripheral membrane proteins that are anchored to the membrane via linkage to a glycosylphosphatidylinositol molecule of the bilayer. (4.4) Grana Orderly, stacked arrangement of thylakoids. (6.1) Green fluorescent protein (GFP) A fluorescent protein encoded by the jellyfish Aequorea victoria that is widely used to follow events in living cells. In most cases the gene encoding the protein is fused to the gene of interest and the DNA encoding the fusion protein is introduced into the cells to be studied. (8.2) Ground state The unexcited state of an atom or molecule. (6.3) Growth cone The distal tip of a growing neuron that contains the locomotor activity required for axonal extension. (9.7) GTP-binding proteins (or G proteins) With key regulatory roles in many different cellular processes, G proteins can be present in at least two alternate conformations, an active form containing a bound GTP molecule, and an inactive form containing a bound GDP molecule. (8.3) GTPase-activating proteins (GAPs) Proteins that bind to G proteins, activating their GTPase activity. As a result, GAPs shorten the duration of a G protein-mediated response. (15.4) Guanine nucleotide-dissociation inhibitors (GDIs) Proteins that bind to G proteins, inhibiting the dissociation of the bound GDP, thus maintaining the G protein in the inactive state. (15.4) Guanine nucleotide-exchange factors (GEFs) Proteins that bind to a G protein, stimulating the exchange of a bound GDP with a GTP, thereby activating the G protein. (15.4) Guanosine triphosphate (GTP) A nucleotide of great importance in cellular activities. It binds to a variety of proteins (called G proteins) and acts as a switch to turn on their activities. (2.5) Half-life A measure of the instability of a radioisotope, or, equivalently, the amount of time required for one-half of the radioactive material to disintegrate. 
(18.4) Haploid Containing only one member of each pair of homologous chromosomes. Haploid cells are produced during meiosis, as exemplified by sperm. Contrast with diploid. (10.4, 14.3) Haplotype A block of the genome that tends to be inherited intact from generation to generation. Haplotypes are generally defined by the presence of a consistent combination of single nucleotide polymorphisms (SNPs). (HP10)
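The Half-life entry above implies a simple rule: after each half-life, half of the remaining radioactive material has disintegrated. The sketch below works that rule through as ordinary exponential decay. The function name and the numbers are illustrative assumptions, not values taken from the text.

```python
def fraction_remaining(elapsed_time: float, half_life: float) -> float:
    """Fraction of a radioisotope sample that has not yet disintegrated.

    elapsed_time and half_life must use the same time unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if elapsed_time < 0:
        raise ValueError("elapsed_time must be non-negative")
    # Each elapsed half-life multiplies the remaining amount by one-half.
    return 0.5 ** (elapsed_time / half_life)


# Hypothetical isotope with a 10-day half-life, measured after 30 days
# (three half-lives): one-eighth of the material remains.
print(fraction_remaining(30.0, 10.0))
```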

Head group The polar, water-soluble region of a phospholipid that consists of a phosphate group linked to one of several small, hydrophilic molecules. (4.3) Heat shock response Activation of the expression of a diverse array of genes in response to temperature elevation. The products of these genes, including molecular chaperones, help the organism recover from the damaging effects of elevated temperature. (EP2) Heavy chain One of the two types of polypeptide chains in an antibody, usually with a molecular mass of 50 to 70 kD. (17.4) Helicase A protein that unwinds the DNA (or RNA) duplex in a reaction in which energy released by ATP hydrolysis is used to break the hydrogen bonds that hold the two strands together. (13.1) Hematopoietic stem cells (HSCs) Cells that are situated primarily in the bone marrow that are capable of both self-renewal and of giving rise to all types of blood cells. (HP1, 17.1) Hemicelluloses Branched polysaccharides of the plant cell wall whose backbone consists of one sugar, such as glucose, and sidechains of other sugars, such as xylose. (7.6) Hemidesmosome Specialized adhesive structure at the basal surface of epithelial cells that functions to attach the cells to the underlying basement membrane. The hemidesmosome contains a dense plaque on the inner surface of the plasma membrane, with keratin-containing filaments coursing out into the cytoplasm. (7.2) Hemolysis The permeabilization of red blood cell membranes, experimentally done through placement of cells in hypotonic solution, where they swell before bursting, releasing cellular contents and leaving membrane ghosts. (4.6) Heterochromatin Chromatin that remains compacted during interphase. 
(12.2) Heterogeneous nuclear RNAs (hnRNAs) A large group of RNA molecules that share the following properties: (1) they have large molecular weights (up to about 80S, or 50,000 nucleotides); (2) they represent many different nucleotide sequences; and (3) they are found only in the nucleus. Includes pre-mRNAs. (11.4) Heterotrimeric G protein A component of certain signal transduction systems, referred to as G proteins because they bind guanine nucleotides (either GDP or GTP), and described as heterotrimeric because all of them consist of three different polypeptide subunits. (15.3) Heterotroph Organism that depends on an external source of organic compounds. (6.1) High-performance liquid chromatography (HPLC) A high-resolution type of chromatography in which long, narrow

columns are used, and the mobile phase is forced through a tightly packed matrix under high pressure. (18.7) Highly repeated fraction Typically short (a few hundred nucleotides at their longest) DNA sequences that are present in at least 10^5 copies per genome. The highly repeated sequences typically account for about 10 percent of the DNA of vertebrates. (10.4) Histone acetyltransferases (HATs) Enzymes that transfer acetyl groups to lysine and arginine residues of core histones. Histone acetylation is associated with activation of transcription. (12.2, 12.4) Histone code A concept that the state and activity of a particular region of chromatin depends upon the specific covalent modifications, or combination of modifications, to the histone tails of the nucleosomes in that region. The modifications are created by enzymes that acetylate, methylate, and phosphorylate various amino acid residues within the core histones. (12.2) Histone deacetylases (HDACs) Enzymes that catalyze the removal of acetyl groups from core histones. Histone deacetylation is associated with the repression of transcription. (12.4) Histones A collection of small, well-defined, basic proteins of chromatin. (12.2) hnRNP (heterogeneous nuclear ribonucleoprotein) The result of the transcription of each hnRNA which becomes associated with a variety of proteins; hnRNP represents the substrate for the processing reactions that follow. (11.4) Homogenize To mechanically rupture cells. (8.2) Homologous chromosomes Paired chromosomes of diploid cells, each carrying one of the two copies of the genetic material carried by that chromosome. (10.1) Homologous sequences When the amino acid sequences of two or more polypeptides, or the nucleotide sequences of two or more genes, are similar to one another, they are presumed to have evolved from the same ancestral sequence. Such sequences are said to be homologous, a term reflecting an evolutionary relatedness. 
(2.4) Humoral immunity Immunity mediated by blood-borne antibodies. (17.1) Hybridomas Hybrid cells produced by fusion of a normal antibody-producing lymphocyte and a malignant myeloma cell. Hybridomas proliferate, producing large amounts of a single (monoclonal) antibody that was being synthesized by the normal cell prior to fusion with the myeloma cell. (18.18) Hydrocarbons The simplest group of organic molecules, consisting solely of carbon and hydrogen. (2.4)

Hydrogen bond The weak, attractive interaction between a hydrogen atom covalently bonded to an electronegative atom (thus, with a partial positive charge) and a second electronegative atom. (2.2) Hydrophilic The tendency of polar molecules to interact with surrounding water molecules, which are also polar; derived from “water loving.” (2.2) Hydrophobic interaction The tendency of nonpolar molecules to aggregate so as to minimize their collective interaction with surrounding polar water molecules; derived from “water fearing.” (2.2) Hypertonic (or Hyperosmotic) Property of one compartment having a higher solute concentration compared with that in a given compartment. (4.7) Hypervariable regions The subsections of antibody variable regions that vary more significantly in sequence from one molecule to another and are associated with antigen specificity. (17.4) Hypotonic (or Hypoosmotic) Property of one compartment having a lower solute concentration compared with that in a given compartment. (4.7) Immune responses Responses elicited by cells of the immune system upon contact with foreign materials, including invading pathogens. Includes innate and adaptive responses. Adaptive immune responses can be divided into primary responses that follow initial exposure to an antigen and secondary responses that follow reexposure to that antigen. (17.1) Immune system Physiological system consisting of organs, scattered tissues, and independent cells that protect the body from invading pathogens and foreign materials. (17) Immunity State in which the body is not susceptible to infection by a particular pathogen. (17) Immunoglobulin superfamily (IgSF) A wide variety of proteins that contain domains composed of 70 to 110 amino acids that are homologous to domains that make up the polypeptide chains of blood-borne antibodies. 
(7.3, 17.1) Immunologic tolerance A state in which the body does not react against a particular substance, such as the body’s own proteins, because cells that could engage in such a response have either been inactivated or destroyed. (17.1, HP17) Immunotherapy Treatment of disease, including cancer and autoimmune disorders, with antibodies or immune cells. (16.4, HP17) Imprinting Differential expression of genes depending solely on whether they were contributed to the zygote by the sperm or the egg. (12.4)

In situ hybridization Technique to localize a particular DNA or RNA sequence in a cell or on a culture plate or electrophoretic gel. (10.3, 18.10) Indirect immunofluorescence A variation of direct immunofluorescence in which the cells are treated with unlabeled antibody, which is allowed to form a complex with the corresponding antigen. The location of the antigen-antibody couple is then revealed in a second step using a preparation of fluorescently labeled antibodies whose combining sites are directed against the antibody molecules used in the first step. (18.18) Induced fit The conformational change in an enzyme after the substrate has been bound that allows the chemical reaction to proceed. (3.2) Inducer A compound that binds to a protein repressor and activates transcription of a bacterial operon. (12.1) Inducible operon An operon in which the presence of the key metabolic substance induces transcription of the structural genes. (12.1) Inflammation The localized accumulation of fluid and leukocytes in response to injury or infection, leading to redness, swelling, and fever. (17.1) Initiation codon The triplet AUG, the site to which the ribosome attaches to the mRNA to assure that the ribosome is in the proper reading frame to correctly read the entire message. (11.8) Initiation factors Soluble proteins (IFs in bacteria and eIFs in eukaryotes) that make initiation of translation possible. (11.8) Innate immune response A nonspecific response to a pathogen that does not require previous exposure to that agent; it includes responses mediated by NK cells, complement, phagocytes, and interferon. (17.1) Insulators Specialized boundary sequences that “cordon off” a promoter and its enhancer from other promoter/enhancer elements. According to one model, insulator sequences bind to proteins of the nuclear matrix. 
(12.4) Insulin receptor substrates (IRSs) Protein substrates that, when phosphorylated in response to insulin, bind and activate a variety of “downstream” effectors. (15.4) Integral protein A membrane-associated protein that penetrates or spans the lipid bilayer. (4.4) Integrins A superfamily of integral membrane proteins that bind specifically to extracellular molecules. (7.2) Intermediate filaments (IFs) Strong, ropelike cytoskeletal fibers approximately 10 nm in diameter that, depending on the cell type,

may be composed of a variety of protein subunits capable of assembling into similar types of filaments. IFs are thought to provide mechanical stability to cells and provide specialized, tissue-specific functions. (9.4) Intermembrane space The space between the inner and outer mitochondrial membranes. (5.1) Interphase The portion of the cell cycle between periods of cell division. (14.1) Intervening sequences Regions of DNA that are between coding sequences of a gene and that are therefore missing from corresponding mRNA. (11.4) Intraflagellar transport (IFT) A process in which particles are moved in both directions between the base of a flagellum or cilium and its tip. The force that drives IFT is generated by motor proteins that track along the peripheral doublets of the axoneme. (9.3) Introns Those parts of a split gene that correspond to the intervening sequences. (11.4) Inversion A chromosomal aberration that results after a chromosome is broken in two places and the resulting center segment is reincorporated into the chromosome in reverse order. (HP12) In vitro Outside the body. Cells grown in culture are said to be grown in vitro; and studies on cultured cells are an essential tool of cell and molecular biologists. (1.2) Ion An atom or molecule with a net positive or negative charge because it has lost or gained one or more electrons during a chemical reaction. (2.1) Ion channel A transmembranous structure (e.g., an integral protein with an aqueous pore) permeable to a specific ion or ions. (4.7) Ion-exchange chromatography A technique for protein purification in which ionic charge is used to separate different proteins. (18.7) Ionic bond A noncovalent bond occurring between oppositely charged ions, also called a salt bridge. (2.2) Iron-sulfur proteins A group of protein electron carriers with an inorganic, iron-sulfur center. 
(5.3) Irreversible inhibitor An enzyme inhibitor that binds tightly, often covalently, thus inactivating the enzyme molecule permanently. (3.2) Isoelectric focusing A type of electrophoresis in which proteins are separated according to isoelectric point. (18.7) Isoelectric point The pH at which the negative charges of the component amino acids of a protein equal the positive charges of the component amino acids, so the protein is neutral. (18.7) Isoforms Different versions of a protein. Isoforms may be encoded by separate,

closely related genes, or formed as splice variants by alternative splicing from a single gene. (2.5) Isotonic Property of one compartment having the same solute concentration compared with that in a given compartment. (4.7) Karyotype Image in which paired homologous chromosomes are ordered in decreasing size. (12.2) Kinesin A plus end-directed motor protein that moves membranous vesicles and other organelles along microtubules through the cytoplasm. Kinesin is a member of a family of kinesin-related proteins (KRPs). (9.3) Kinetic energy Energy released from a substance through atomic or molecular movements. (3.1) Kinetochore A buttonlike structure situated at the outer surface of the centromere to which the microtubules of the spindle attach. (14.2) Knockout mice Mice, born as the result of a series of experimental procedures, that are lacking a functional gene that they would normally contain. (18.17) Lagging strand The newly synthesized daughter DNA strand that is synthesized discontinuously, so called because the initiation of each fragment must wait for the parental strands to separate and expose additional template. (13.1) Lamellipodium The leading edge of a moving fibroblast, which is extended out from the cell as a broad, flattened, veil-like projection that glides over the substratum. (9.7) Lateral gene transfer Transfer of genes from one species to another. (EP1) Leading strand The newly synthesized daughter DNA strand that is synthesized continuously, so called because its synthesis continues as the replication fork advances. (13.1) Ligand Any molecule that can bind to a receptor because it has a complementary structure. (4.1, 15.1) Light chain Smaller of the two types of polypeptide chains in an antibody, with a molecular mass of 23 kDa. (17.4) Light-harvesting complex II (LHCII) Pigment-protein complex, located outside the photosystem itself, that contains most of the antenna pigments that collect light for PSII. It can also be associated with PSI. 
(6.4) Light-dependent reactions First of two series of reactions that compose photosynthesis. In these reactions, energy from sunlight is absorbed and converted to chemical energy that is stored in ATP and NADPH. (6.2)
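The Hypertonic, Hypotonic, and Isotonic entries all describe one comparison: the solute concentration of one compartment relative to another. The sketch below expresses that comparison directly. The function name, the units, and the tolerance used to decide that two concentrations are "the same" are assumptions made for illustration.

```python
def tonicity(compartment_conc: float, reference_conc: float,
             tolerance: float = 1e-9) -> str:
    """Classify a compartment relative to a reference compartment.

    Both concentrations must be in the same units (e.g., mM of solute).
    The tolerance is a hypothetical cutoff for treating values as equal.
    """
    difference = compartment_conc - reference_conc
    if abs(difference) <= tolerance:
        return "isotonic"
    # Higher solute concentration than the reference is hypertonic;
    # lower is hypotonic.
    return "hypertonic" if difference > 0 else "hypotonic"


# Hypothetical values: a 150 mM bathing solution compared with 300 mM inside a cell.
print(tonicity(150.0, 300.0))
```

The labels are always relative: if compartment A is hypertonic to B, then B is hypotonic to A.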

Light-independent reactions (dark reactions) Second of two series of reactions that compose photosynthesis. In these reactions, carbohydrates are synthesized from carbon dioxide using the energy stored in the ATP and NADPH molecules formed in the light-dependent reactions. (6.2) Limit of resolution The resolution attainable by a microscope is limited by the wavelength of the illumination according to the equation D = 0.61 λ/(n sin α), where D is the minimum distance that two points in the specimen must be separated to be resolved, λ is the wavelength of light, and n is the refractive index of the medium. Alpha (α) is a measure of the light-gathering ability of the lens and is directly related to its aperture. For a light microscope, the limit of resolution is slightly less than 200 nm. (18.1) Linkage groups Groups of genes that reside on the same chromosome causing nonindependent segregation of traits controlled by these genes. (10.1) Lipid-anchored protein A membrane-associated protein that is located outside the bilayer but is covalently linked to a lipid molecule within the bilayer. (4.4) Lipid bilayer Phospholipids self-assembled into a bimolecular structure based on hydrophobic and hydrophilic interactions; biologically important as the core organization of cellular membranes. (4.2) Lipid rafts Microdomains within a cellular membrane that possess decreased fluidity due to the presence of cholesterol, glycolipids, and phospholipids containing longer, saturated fatty acids. A proposed residence of GPI-anchored proteins and signaling proteins. (4.5) Lipids Nonpolar organic molecules, including fats, steroids, and phospholipids, whose common property of not dissolving in water contributes to much of their biological activity. (2.5) Liposome An artificial lipid bilayer that self-assembles into a spherical vesicle or vesicles when in an aqueous environment. (4.3) Locus (pl. loci) The position of a gene on a chromosome. 
(10.1) Luminal (cisternal) space The region of fluid content of the cytoplasm enclosed by the membranes of the endoplasmic reticulum or Golgi complex. (8.3) Lymphocytes Nucleated leukocytes (white blood cells) that circulate between the blood and lymph organs that mediate acquired immunity. Includes both B cells and T cells. (17) Lysosomal storage disorders Diseases characterized by the deficiency of a lysosomal enzyme and the corresponding accumulation of undegraded substrate. (HP8)
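The Limit of resolution entry gives the relation D = 0.61 λ/(n sin α). As a rough worked example, the sketch below evaluates it for one set of illustrative values. It assumes α is the half-angle of the cone of light entering the objective lens, which the entry describes only as being related to the aperture. The chosen wavelength, refractive index, and angle are hypothetical inputs, not figures from the text.

```python
import math


def limit_of_resolution_nm(wavelength_nm: float, refractive_index: float,
                           half_angle_deg: float) -> float:
    """Minimum resolvable separation D = 0.61 * wavelength / (n * sin(alpha)).

    half_angle_deg is taken to be the half-angle alpha of the light cone
    entering the objective (an assumption about the symbol's meaning).
    """
    if not 0 < half_angle_deg <= 90:
        raise ValueError("half_angle_deg must be in the range (0, 90]")
    # n * sin(alpha) is the denominator of the glossary equation.
    denominator = refractive_index * math.sin(math.radians(half_angle_deg))
    return 0.61 * wavelength_nm / denominator


# Hypothetical setup: 400 nm light, a medium with n = 1.5, and alpha = 70 degrees.
d = limit_of_resolution_nm(400.0, 1.5, 70.0)
print(f"D is roughly {d:.0f} nm")
```

With these inputs D comes out somewhat below 200 nm, in line with the figure the entry quotes for a light microscope. Shorter wavelengths or a larger n sin α give a smaller D, meaning finer detail can be resolved.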

M phase The part of the cell cycle that includes the processes of mitosis, during which duplicated chromosomes are separated into two nuclei, and cytokinesis, during which the entire cell is physically divided into two daughter cells. (14.1) Macromolecules Large, highly organized molecules crucial to the structure and function of cells; divided into polysaccharides, certain lipids, proteins, and nucleic acids. (2.4) Major histocompatibility complex (MHC) A region of the genome that encodes MHC proteins. The genes that encode these proteins tend to be highly polymorphic, being represented by a large number of different alleles. These genetic differences between humans account for the tendency of a person to reject a transplant from another person other than an identical twin. (17.4) Mass spectrometry Methodology to identify molecules (including proteins). A protein or mixture of proteins is fragmented, converted into gaseous ions, and propelled through a tubular component of a mass spectrometer, causing the ions to separate according to their mass/charge (m/z) ratio. Identification of the protein(s) is made by comparison with a computer database of the sequence of proteins encoded by a particular genome. (2.5, 18.7) Matrix One of two aqueous compartments of a mitochondrion; the matrix is located within the interior of the organelle; the second compartment is called the intermembrane space and is located between the outer and inner mitochondrial membrane. (5.1) Matrix metalloproteinases (MMPs) A family of zinc-containing enzymes that act in the extracellular space to digest various extracellular proteins and proteoglycans. (7.1) Maximal velocity (Vmax ) The highest rate achieved for a given enzymatically catalyzed reaction, it occurs when the enzyme is saturated with substrate. (3.2) Medial cisternae The cisternae of the Golgi complex between the cis and trans cisternae. 
(8.4) Meiosis The process during which the chromosome number is reduced so that cells are formed that contain only one member of each pair of homologous chromosomes. (14.3) Membrane fluidity A property of the physical state of the lipid bilayer of a membrane that allows diffusion of membrane lipids and proteins within the plane of the membrane. Inversely related to membrane viscosity. Membrane fluidity increases as the temperature rises and in bilayers with more unsaturated lipids. (4.5) Membrane potential The electrical potential difference across a membrane. (4.8)
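The Maximal velocity (Vmax) and Michaelis constant (KM) entries are linked by the standard Michaelis-Menten rate law, v = Vmax[S]/(KM + [S]). The glossary does not write this equation out, so the sketch below should be read as the textbook relation under its usual assumptions, with hypothetical parameter values.

```python
def reaction_velocity(substrate_conc: float, v_max: float, k_m: float) -> float:
    """Michaelis-Menten rate: v = Vmax * [S] / (KM + [S]).

    substrate_conc and k_m must share the same concentration units;
    the result has the units of v_max.
    """
    if substrate_conc < 0 or k_m <= 0 or v_max < 0:
        raise ValueError("concentrations and rates must be non-negative, k_m positive")
    return v_max * substrate_conc / (k_m + substrate_conc)


# Hypothetical enzyme with Vmax = 10 (arbitrary rate units) and KM = 2 mM.
print(reaction_velocity(2.0, 10.0, 2.0))      # [S] equal to KM gives half of Vmax
print(reaction_velocity(2000.0, 10.0, 2.0))   # saturating [S] approaches Vmax
```

The two calls illustrate both definitions: the rate is one-half of Vmax when the substrate concentration equals KM, and it approaches Vmax as the enzyme becomes saturated with substrate.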

Messenger RNA (mRNA) The intermediate molecule between a gene and the polypeptide for which it codes. Messenger RNA is assembled as a complementary copy of one of the two DNA strands that encodes the gene. (11.1) Metabolic intermediate A compound produced during one step of a metabolic pathway. (2.4) Metabolic pathway A series of chemical reactions that results in the synthesis of an end product important to cellular function. (2.4) Metabolism The total of the chemical reactions occurring within a cell. (1.2) Metaphase The stage of mitosis during which all of the chromosomes have become aligned at the spindle equator, with one chromatid of each chromosome connected to one pole and its sister chromatid connected to the opposite pole. (14.2) Metastasis Spread of cancer cells from a primary tumor to distant sites in the body where the formation of secondary tumors may arise. (16.3) Methylguanosine cap Modification of the 5′ end of an mRNA precursor molecule, so that the terminal “inverted” guanosine is methylated at the 7′ position on its guanine base, while the nucleotide on the internal side of the triphosphate bridge is methylated at the 2′ position of the ribose. This cap prevents the 5′ end of the mRNA from being digested by nucleases, aids in transport of the mRNA out of the nucleus, and plays a role in the initiation of mRNA translation. (11.4) MHC proteins Proteins encoded by the MHC region of the genome that bind processed antigens (antigenic peptides) and display them on the surface of the cell. Divided into two major classes, MHC class I molecules produced by virtually all cells of the body, and MHC class II molecules produced by “professional” APCs, such as macrophages and dendritic cells. (17.4) Michaelis constant (KM) In enzyme kinetics, the value equal to the substrate concentration present when reaction rate is one-half of the maximal velocity. (3.2) Micrometer Measure of length equaling 10^-6 meters. 
(1.3) MicroRNAs (miRNAs) Small RNAs (20-23 nucleotides long) that are synthesized from many sites in the genome and involved in inhibiting translation or increasing degradation of complementary mRNAs. (11.5) Microfibrils Bundles of cellulose molecules that confer rigidity to the cell wall and provide resistance to pulling forces. (7.6) Microfilaments Solid, 8-nm thick, cytoskeletal structures composed of a double-helical

polymer of the protein actin. They play a key role in virtually all types of contractility and motility within cells. (9, 9.5) Microscope An instrument that provides a magnified image of a tiny object. (1.1) Microsomes A heterogeneous collection of vesicles formed from the endomembrane system (primarily the endoplasmic reticulum and Golgi complex) after homogenization. (8.2) Microtubule-organizing centers (MTOCs) A variety of specialized structures that exert a role in initiating microtubule formation. (9.3) Microtubule-associated proteins (MAPs) Proteins other than tubulin contained in microtubules obtained from cells. MAPs may interconnect microtubules to form bundles and can be seen as cross-bridges connecting microtubules to each other. Other MAPs may increase the stability of microtubules, alter their rigidity, or influence the rate of their assembly. (9.3) Microtubules Hollow, cylindrical cytoskeletal structures, 25 nm in diameter, whose wall is composed of the protein tubulin. Microtubules are polymers assembled from αβ-tubulin heterodimers that are arranged in rows, or protofilaments. Because of their rigidity, microtubules often act in a supportive capacity. (9, 9.3) Mismatch repair DNA repair system that removes mismatched bases that are incorporated by the DNA polymerase and escape the enzyme’s proofreading exonuclease. (13.2) Mitochondrial matrix The aqueous compartment within the interior of a mitochondrion. (5.1) Mitochondrial membranes The outer membrane serves as a boundary with the cytoplasm and is relatively permeable, and the inner membrane houses respiratory machinery in its many invaginations and is highly impermeable. (5.1) Mitochondrion The cellular organelle in which aerobic energy transduction takes place, oxidizing metabolic intermediates such as pyruvate to produce ATP. 
(5.1) Mitosis Process of nuclear division in which duplicated chromosomes are faithfully separated from one another, producing two nuclei, each with a complete copy of all the chromosomes present in the original cell. (14.2) Mitotic spindle Microtubule-containing “machine” that functions in the organization and sorting of duplicated chromosomes during mitotic cell division. (14.2) Model organisms Organisms that have been widely used for research so that a great deal is known about their biology. These organisms have properties that have made

them excellent research subjects. Such organisms include the bacterium, E. coli; the budding yeast, S. cerevisiae; the nematode, C. elegans; the fruit fly, D. melanogaster; the mustard plant, A. thaliana; and the mouse, M. musculus. (1.3) Moderately repeated fraction DNA sequences that are repeated from a few to several hundred thousand times within a eukaryotic genome. The moderately repeated fraction of the DNA can vary from about 20 to about 80 percent of the total DNA. These sequences may be identical to each other or nonidentical but related. (10.4) Molecular chaperones Various families of proteins whose role is to assist the folding and assembly of proteins by preventing undesirable interactions. (EP2) Monoclonal antibody A preparation of antibody molecules produced from a single colony (or clone) of cells. (18.18) Monosomy A chromosome complement that lacks one chromosome, i.e., has only one member of one of the pairs of homologous chromosomes. (HP14) Motif A substructure found among many different proteins, such as the αβ barrel, which consists of β strands connected by an α-helical region. (2.5) Motor proteins Proteins that utilize the energy of ATP hydrolysis to generate mechanical forces that propel the protein, as well as attached cargo, along one of the components of the cytoskeleton. Three families of motor proteins are known: kinesins and dyneins move along microtubules and myosins move along microfilaments. (9.3) Multiprotein complex The interaction of more than one complete protein to form a larger, functional complex. (2.5) Muscle fiber A skeletal muscle cell, referred to as a muscle fiber because of its highly ordered, multinucleated, cablelike structure composed of hundreds of thinner, cylindrical myofibrils. (9.6) Mutant An individual having an inheritable characteristic that distinguishes it from the wild type. (10.1) Mutation A spontaneous change in a gene that alters it in a permanent fashion so that it causes heritable change. 
(10.1) Myelin sheath The lipid-rich material wrapped around most neurons in the vertebrate body. (4.8) Myofibrils The thin, cylindrical strands found within muscle fibers. Each myofibril is composed of repeating linear arrays of contractile units, called sarcomeres, that give skeletal muscle cells their striated appearance. (9.6) Myosin A large family of motor proteins that moves along actin-containing microfilaments. Most myosins are plus-end

directed motors. Conventional myosin (myosin II) is the protein that mediates muscle contractility as well as certain types of nonmuscle motility, such as cytokinesis. Unconventional myosins (I and III-XVIII) have many diverse roles including organelle transport. (9.5) Nanometer Measure of length equaling 10^-9 meters. (1.3) Nanotechnology A field of engineering involving the development of tiny “nanomachines” capable of performing specific activities in a submicroscopic world. (9.1) Nascent protein A protein in the process of being synthesized, i.e., not yet complete. (11.8) Natural killer (NK) cell A type of lymphocyte that engages in a nonspecific attack on an infected host cell, leading to apoptosis. (17.1) Negative staining Procedures in which heavy-metal deposits are collected everywhere on the specimen grid except in the locations of very small particulate materials, including high-molecular-weight aggregates such as viruses, ribosomes, multisubunit enzymes, cytoskeletal elements, and protein complexes. (18.2) Nerve impulse The process through which an action potential is propagated along the membrane of a neuron by sequentially triggering action potentials in adjacent stretches of membrane. (4.8) Neurofilaments Loosely packed bundles of intermediate filaments located within the cytoplasm of neurons. Neurofilaments have long axes that are oriented parallel to that of the nerve cell axon and are composed of three distinct proteins: NF-L, NF-H, and NF-M. (9.4) Neuromuscular junction The point of contact of a terminus of an axon with a muscle fiber, the neuromuscular junction is a site of transmission of nerve impulses from the axon across the synaptic cleft to the muscle fiber. (4.8, 9.6) Neurotransmitter A chemical that is released from a presynaptic terminal and binds to the postsynaptic target cell, altering the membrane potential of the target cell. 
(4.8) Nitrogen fixation The process through which nitrogen gas is chemically reduced and converted into a component of organic compounds. (1.3) Noncompetitive inhibitor An enzyme inhibitor that does not bind at the same site as the substrate, and so the level of inhibition depends only on the concentration of inhibitor. (3.2) Noncovalent bond A relatively weak chemical bond based on attractive forces between

oppositely charged regions within a molecule or between two nearby molecules. (2.2) Noncyclic photophosphorylation The formation of ATP during the process of oxygen-releasing photosynthesis in which electrons move in a linear path from H2O to NADP⁺. (6.5) Nonpolar molecules Molecules whose covalent bonds have a nearly symmetric distribution of charge because the component atoms have approximately the same electronegativities. (2.1) Nonrepeated fraction Those DNA sequences in the genome that are present in only one copy per haploid set of chromosomes. These sequences contain the greatest amount of genetic information including the codes for virtually all proteins other than histones. (10.4) Nonsense mutations Mutations that produce stop codons within genes, thereby causing premature termination of the encoded polypeptide chain. (11.8) Nonsense-mediated decay (NMD) An mRNA surveillance mechanism that detects mRNAs containing premature termination (nonsense) codons, leading to their destruction. (11.8) Nontranscribed spacer The region of a gene cluster that is not transcribed. Nontranscribed spacers are present between various types of tandemly repeated genes, including those of tRNAs, rRNAs, and histones. (11.3) Nuclear envelope The complex, double-membrane structure that divides the eukaryotic nucleus from its cytoplasm. (12.2) Nuclear envelope breakdown The disassembly of the nuclear envelope at the end of prophase. (14.2) Nuclear lamina A thin meshwork composed of intermediate filaments that lines the inner surface of the nuclear envelope. (12.2) Nuclear localization signal (NLS) Sequence of amino acids in a protein that is recognized by a transport receptor leading to translocation of the protein from the cytoplasm to the nucleus. (12.2) Nuclear pore complex (NPC) Complex, basketlike apparatus that fills the nuclear pore like a stopper, projecting outward into both the cytoplasm and the nucleoplasm.
(12.2) Nucleic acid Polymers composed of nucleotides, which in living organisms are based on one of two sugars, ribose or deoxyribose, yielding the terms ribonucleic acid (RNA) and deoxyribonucleic acid (DNA). (2.5) Nucleic acid hybridization A variety of related techniques that are based on the fact that two single-stranded nucleic

acid molecules of complementary base sequence will form a double-stranded hybrid. (18.10) Nucleoid The poorly defined region of a prokaryotic cell that contains its genetic material. (1.3) Nucleoli (sing. nucleolus) Irregular-shaped nuclear structures that function as ribosome-producing organelles. (11.3) Nucleosomes Repeating subunits of chromatin. Each nucleosome contains a nucleosome core particle, consisting of 146 base pairs of supercoiled DNA wrapped almost twice around a disc-shaped complex of eight histone molecules. The nucleosome core particles are connected to one another by a stretch of linker DNA. (12.2) Nucleotide The monomer of nucleic acids; each consists of three parts: a sugar (either ribose or deoxyribose), a phosphate group, and a nitrogenous base, with the phosphate linked to the sugar at the 5′ carbon and the base at the 1′ carbon. (2.5, 10.3) Nucleotide excision repair (NER) A cut-and-patch mechanism for the removal from the DNA of a variety of bulky lesions, e.g., pyrimidine dimers, caused by ultraviolet radiation. (13.2) Nucleus The organelle that contains a eukaryotic cell’s genetic material. (1.3) Objective lens The lens of a light microscope that focuses light rays from the specimen to form a real, enlarged image of the object within the column of the microscope. (18.1) Oils Fats that are liquid at room temperature. (2.5) Okazaki fragments Small segments of DNA that are rapidly linked to longer pieces that have been synthesized previously, to form the lagging strand. (13.1) Oligosaccharides Small chains composed of sugars covalently attached to lipids and proteins; they distinguish one type of cell from another and help mediate interactions of a cell with its surroundings. (2.5) Oncogenes Genes that encode proteins that promote the loss of growth control and the conversion of the cell to a malignant state. These genes have the ability to transform cells.
(16.3) Operator Binding site for bacterial repressors that is situated between the polymerase binding site and the first structural gene. (12.1) Operon A functional complex on a bacterial chromosome comprising a cluster of genes including structural genes, a promoter region, an operator region, and a regulatory gene. (12.1) Organelles The organizationally and functionally diverse, membranous or membrane-bounded, intracellular

structures that are the defining feature of eukaryotic cells. (1.3) Origin of replication The specific site on the bacterial chromosome where replication begins. (13.1) Osmosis The property of water passing through a semipermeable membrane from a region of lower solute concentration to one of higher solute concentration, with the tendency of eventually equalizing solute concentration in the two compartments. (4.7) Oxidation The process through which an atom loses one or more electrons to another atom, in which the atom gaining electrons is considered to be reduced. (3.3) Oxidation-reduction (redox) potential The separation of charge, measured in voltage, for any given pair of oxidizing-reducing agents, such as NAD⁺ and NADH, relative to a standard couple (e.g., H⁺ and H2). (5.3) Oxidation-reduction (redox) reaction One in which a change in the electronic state of the reactants occurs. (3.3) Oxidative phosphorylation ATP formation driven by energy derived from high-energy electrons removed during substrate oxidation in pathways such as the TCA cycle, with the energy released for ATP formation by passage of the electrons through the electron-transport chain in the mitochondrion. (5.3) Oxidizing agent The substance in a redox reaction that becomes reduced, causing the other substance to become oxidized. (3.3) P (peptidyl) site Site in the ribosome from which the tRNA donates amino acids to the growing polypeptide chain. (11.8) P680 The reaction center of photosystem II. “P” stands for pigment, and “680” is the wavelength of light that this molecule absorbs most strongly. (6.4) P700 The reaction center of photosystem I. “P” stands for pigment, and “700” is the wavelength of light that this molecule absorbs most strongly. (6.4) Partition coefficient The ratio of a solute’s solubility in oil to that in water; it is a measure of the relative polarity of a biological substance.
(4.7) Patch clamping A technique to study ion movement through ion channels that is accomplished by clamping, or maintaining voltage, across a patch of membrane by sealing a micropipette electrode to the surface and then measuring the current across that portion of membrane. (4.5) Pathogen Any agent capable of causing infection or disease in a cell or organism. (2.5) Pectins A heterogeneous class of negatively charged polysaccharides that make up the matrix of the plant cell wall. Pectins hold water and form a gel that fills in the spaces between the fibrous elements. (7.6)

Penetrance The probability that a given mutation will result in disease. (HP10) Peptide bond The chemical bond linking amino acids in a protein, which forms when the carboxyl group of one amino acid reacts with the amino group of a second amino acid. (2.5) Peptidyl transferase That portion of the large ribosomal subunit that is responsible for catalyzing peptide bond formation; peptidyl transferase activity resides in the large ribosomal RNA molecule. (11.8) Pericentriolar material (PCM) Amorphous, electron-dense material that surrounds the centrioles in an animal cell. (9.3) Peripheral protein A membrane-associated protein that is located entirely outside of the lipid bilayer and interacts with it through noncovalent bonds. (4.4) Peroxisomes (microbodies) Simple, membrane-bound, multi-functional organelles of the cytoplasm that carry out a diverse array of metabolic reactions, including substrate oxidation leading to formation of hydrogen peroxide. For example, peroxisomes are the site of oxidation of very-long-chain fatty acids, the oxidation of uric acid, and the synthesis of plasmalogens. Plant glyoxysomes, which carry out the glyoxylate cycle, are a type of peroxisome. (5.6) pH The standard measure of relative acidity; it mathematically equals −log[H⁺]. (2.3) PH domain A protein domain that binds to the phosphorylated inositol rings of membrane-bound phosphoinositides. (15.2) Phagocytosis Process by which particulate materials are taken into cells. Materials are enclosed within a fold of the plasma membrane, which buds into the cytoplasm to form a vesicle called a phagosome. (8.8) Phase-contrast microscope A microscope that converts differences in refractive index into differences in intensity (relative brightness and darkness), which are then visible to the eye, making highly transparent objects more visible. (18.1) Phosphatidylinositol 3-hydroxy kinase [or PI3K] One of the best-studied effectors containing an SH2 domain.
The products of the enzyme serve as inositol-containing cellular messengers that have diverse functions in cells. (15.4) Phosphoglycerides The name given to membrane phospholipids that are built on a glycerol backbone. (4.2) Phosphoinositides Includes a number of phosphorylated phosphatidylinositol derivatives (e.g., PIPs, PIP2s, and PIP3) that serve as second messengers in signaling pathways. (15.3) Phospholipase C An enzyme that catalyzes a reaction that splits PIP2 into two molecules: inositol 1,4,5-trisphosphate (IP3) and

diacylglycerol (DAG), both of which play important roles as second messengers in cell signaling. (15.3) Phospholipid-transfer proteins Proteins whose function is to transport specific phospholipids through the aqueous cytosol from one type of membrane compartment to another. (8.3) Phospholipids Phosphate-containing lipids that represent the primary constituents of the lipid bilayer of cellular membranes. Phospholipids include both phosphoglycerides and sphingomyelin. (4.3) Photoautotroph An autotroph that utilizes the radiant energy of the sun to convert CO2 into organic compounds. (6) Photolysis The splitting of water during photosynthesis. (6.4) Photon Packet of light energy. The shorter the wavelength, the greater the energy of the photons. (6.3) Photorespiration A series of reactions in which O2 is attached to RuBP, eventually resulting in the release of recently fixed CO2 from the plant. (6.6) Photosynthesis The pathway converting the energy of sunlight into chemical energy that is usable by living organisms. (6) Photosynthetic unit A group of several hundred chlorophyll molecules acting together to trap photons and transfer energy to the pigment molecule at the reaction center. (6.4) Photosystem I (PSI) One of two spatially separated pigment complexes, which are necessary to boost the energy of a pair of electrons sufficiently to remove them from a molecule of water and transfer them to NADP⁺. Photosystem I raises electrons from an energy level at about the midway point to an energy level above NADP⁺. (6.4) Photosystem II (PSII) One of two spatially separated pigment complexes, which are necessary to boost the energy of a pair of electrons sufficiently to remove them from a molecule of water and transfer them to NADP⁺. Photosystem II boosts electrons from an energy level below that of water at the bottom of the energy trough to about the midway point.
(6.4) Phragmoplast Dense material roughly aligned in the equatorial plane of the previous metaphase plate in plant cells, consisting of clusters of interdigitating microtubules oriented perpendicular to the future cell plate, together with vesicles and associated electron-dense material. (14.3) Pigments Molecules that contain a chromophore, a chemical group capable of absorbing light of particular wavelength(s) within the visible spectrum. (6.3) PiwiRNAs (piRNAs) Small RNAs (24-32 bases) that are encoded by a small number

of large genomic loci and act to suppress the movement of transposable elements in germ cells. piRNAs are derived from single-stranded precursors and do not require Dicer for processing. (11.5) Plasma cells Terminally differentiated cells that develop from B lymphocytes and that synthesize and secrete large amounts of blood-borne antibodies. (17.2) Plasma membrane The membrane serving as a boundary between the interior of a cell and its extracellular environment. (4.1) Plasmodesmata Cytoplasmic channels, 30 to 60 nm in diameter, that connect most plant cells and extend between adjacent cells directly through the cell wall. Plasmodesmata are lined with plasma membrane and usually contain a dense central structure, the desmotubule, derived from the endoplasmic reticulum of the two cells. (7.5) Plasmolysis The shrinkage that occurs when a plant cell is placed into a hypertonic medium; its volume shrinks as the plasma membrane pulls away from the surrounding cell wall. (4.7) Polar molecules Molecules with an uneven distribution of charge because the component atoms of various bonds have greatly different electronegativities. (2.1) Poly(A) tail A string of adenosine residues at the 3′ end of an mRNA added posttranscriptionally. (11.4) Polyacrylamide gel electrophoresis (PAGE) Protein fractionation technique in which the proteins are driven by an applied current through a gel composed of a small organic molecule (acrylamide) that is cross-linked to form a molecular sieve. (18.7) Polymerase chain reaction (PCR) A technique in which a single region of DNA, which may be present in vanishingly small amounts, can be amplified cheaply and rapidly. (18.13) Polypeptide chain A long, continuous unbranched polymer formed by amino acids joined to one another by covalent peptide bonds. (2.5) Polyploidization (whole-genome duplication) Phenomenon in which offspring have twice the number of chromosomes in each cell as their diploid parents.
Can be an important step in the evolution of a new species. (10.5) Polyribosome (polysome) The complex formed by an mRNA and a number of ribosomes in the process of translating that mRNA. (11.8) Polysaccharide A polymer of sugar units joined by glycosidic bonds. (2.5) Polytene chromosomes Giant chromosomes of insects that contain perfectly aligned, duplicated DNA strands, with as many as 1,024 times the number of DNA strands of normal chromosomes. (10.1)

Porins Integral proteins found in bacterial, mitochondrial and chloroplast outer membranes that act as large, relatively nonselective channels. (5.1) Posttranslational modifications (PTMs) Alterations to the side chains of the 20 basic amino acids after their incorporation into a polypeptide chain. (2.5) Potential difference The difference in charge between two compartments, often measured as voltage across the separating membrane. (4.7) Potential energy Stored energy that can be used to perform work. (3.1) Preinitiation complex The assembled association of general transcription factors and RNA polymerase, required before transcription of the gene can be initiated. (11.4) Pre-RNA An RNA molecule that has not yet been processed into its final mature form (e.g., a pre-mRNA, pre-rRNA, or pre-tRNA). (11.4) Primary cilium A single nonmotile cilium present on many types of cells in vertebrates and thought to have a sensory function. (HP9) Primary culture Culturing of cells obtained directly from the organism. (18.5) Primary electron acceptor Molecule that receives the photoexcited electron from reaction-center pigments in both photosystems. (6.4) Primary structure The linear sequence of amino acids within a polypeptide chain. (2.5) Primary transcript (or pre-RNA) The initial RNA molecule synthesized from DNA, which is equivalent in length to the DNA from which it was transcribed. Primary transcripts typically have a fleeting existence, being processed into smaller, functional RNAs by a series of “cut-and-paste” reactions. (11.2) Primary cell walls The walls of a growing plant cell. They allow for extensibility. (7.6) Primase Type of RNA polymerase that assembles the short RNA primers that begin the synthesis of each Okazaki fragment of the lagging strand. (13.1) Primer The DNA or RNA strand that provides DNA polymerase with the necessary 3′ OH terminus.
(13.1) Prion An infectious agent associated with certain mammalian neurodegenerative diseases that is composed solely of protein. (HP2) Processing-level control Regulation of the path by which a primary RNA transcript is processed into a messenger RNA that can be translated into a polypeptide. (12.4) Processive A term applied to proteins (e.g., kinesin or RNA polymerase) that are capable of moving considerable distances along their track or template (e.g., a microtubule or a

DNA molecule) without dissociating from it. (9.3, 11.2) Prokaryotic cells Structurally simple cells, including archaea and bacteria, that do not have membrane-bounded organelles; derived from pro-karyon, or “before the nucleus.” (1.3) Prometaphase The phase of mitosis during which the definitive mitotic spindle is formed and the chromosomes are moved into position at the center of the cell. (14.2) Promoter The site on the DNA to which an RNA polymerase molecule binds prior to initiating transcription. The promoter contains information that determines which of the two DNA strands is transcribed and the site at which transcription begins. (11.2, 12.1) Prophase The first stage of mitosis during which the duplicated chromosomes are prepared for segregation and the mitotic machinery is assembled. (14.2) Proplastids Nonpigmented precursors of chloroplasts. (6.1) Prosthetic group A portion of a protein that is not composed of amino acids, such as the heme group within hemoglobin and myoglobin. (2.5) Proteasome Barrel-shaped, multiprotein complex in which cytoplasmic proteins are degraded. Proteins selected for destruction are linked to ubiquitin molecules and threaded into the central chamber of the proteasome. (12.7) Protein kinase An enzyme that transfers phosphate groups to other proteins, often having the effect of regulating the activity of the other proteins. (3.3) Protein tyrosine kinases Enzymes that phosphorylate specific tyrosine residues of other proteins. (15.4) Proteins Structurally and functionally diverse group of polymers built of amino acid monomers. (2.5) Proteoglycan A protein-polysaccharide complex consisting of a core protein molecule to which chains of glycosaminoglycans are attached. Due to the acidic nature of the glycosaminoglycans, proteoglycans are capable of binding huge numbers of cations, which in turn draw huge numbers of water molecules.
As a result, proteoglycans form a porous, hydrated gel that acts like a “packing” material to resist compression. (7.1) Proteome The entire inventory of proteins in a particular organism, cell type, or organelle. (2.5) Proteomics Expanding field of protein biochemistry that performs large-scale studies on diverse mixtures of proteins. (2.5) Protofilaments Longitudinally arranged rows of globular subunits of a microtubule that are aligned parallel to the long axis of the tubule. (9.3)

Proton-motive force (Δp) An electrochemical gradient that is built up across energy-transducing membranes (inner mitochondrial membrane, thylakoid membrane, bacterial plasma membrane) following the translocation of protons during electron transport. The energy of the gradient, which is comprised of both a pH gradient and a voltage and is measured in volts, is utilized in the formation of ATP. (5.4) Proto-oncogenes A variety of genes that have the potential to subvert the cell’s own activities and push the cell toward the malignant state. Proto-oncogenes encode proteins that have various functions in a cell’s normal activities. Proto-oncogenes can be converted to oncogenes. (16.3) Protoplast A naked plant cell whose cell wall has been digested away by the enzyme cellulase. (18.5) Provirus The term for viral DNA when it has been integrated into the DNA of its host cell’s chromosome(s). (1.4) Pseudogenes Sequences that are clearly homologous to functional genes, but have accumulated mutations that render them nonfunctional. (10.5) Pseudopodia Broad, rounded protrusions formed during amoeboid movement as portions of the cell surface are pushed outward by a column of cytoplasm that flows through the interior of the cell toward the periphery. (9.7) Purine A class of nitrogenous base found in nucleotides that has a double-ring structure, including adenine and guanine, which are found in both DNA and RNA. (2.5, 10.3) Pyrimidine A class of nitrogenous base found in nucleotides that has a single-ring structure, including cytosine and thymine, which are found in DNA, and cytosine and uracil, which are found in RNA. (2.5, 10.3) Quality control Cells contain various mechanisms that ensure that the proteins and nucleic acids they synthesize have the appropriate structure.
For example, misfolded proteins are translocated out of the ER and destroyed by proteasomes in the cytosol; mRNAs that contain premature termination codons are recognized and destroyed; and DNA containing abnormalities (lesions) are recognized and repaired. (e.g., 8.3) Quaternary structure The three-dimensional organization of a protein that consists of more than one polypeptide chain, or subunit. (2.5) Rabs A family of monomeric G proteins involved in vesicle trafficking. (8.5) Ran A GTP-binding protein that exists in an active GTP-bound form or an inactive GDP-bound form. Ran regulates nucleocytoplasmic transport. (12.2)

Ras-MAP kinase cascade A cascade that is turned on in response to a wide variety of extracellular signals and plays a key role in regulating vital activities such as cell proliferation and differentiation. (15.4) rDNA The DNA sequences encoding rRNA that are normally repeated hundreds of times and are typically clustered in one or a few regions of the genome. (11.3) Reaction-center chlorophyll The single chlorophyll molecule of the several hundred or so in the photosynthetic unit that actually transfers electrons to an electron acceptor. (6.4) Reannealing (renaturation) Reassociation of complementary single strands of a DNA double helix that had been previously denatured. (10.3) Receptor Any substance that can bind to a specific molecule (ligand), often leading to uptake or signal transduction. (4.1, 15.1) Receptor protein-tyrosine kinases (or RTKs) Cell-surface receptors that, following ligand binding, can phosphorylate tyrosine residues on themselves and/or on cytoplasmic substrates. They are involved primarily in the control of cell growth and differentiation. (15.4) Recombinant DNA Molecules containing DNA sequences derived from more than one source. (18.12) Reducing agent The substance in a redox reaction that becomes oxidized, causing the other substance to become reduced. (3.3) Reducing power The potential in a cell to reduce metabolic intermediates into products, usually measured through the size of the NADPH pool. (3.3) Reduction The process through which an atom gains one or more electrons from another atom, in which the atom losing electrons is considered to be oxidized. (3.3) Refractory period The brief period of time following the end of an action potential during which an excitable cell cannot be restimulated to threshold.
(4.8) Regulated secretion Discharge of materials synthesized in the cell that have been stored in membrane-bound secretory granules in the peripheral regions of the cytoplasm, occurring in response to an appropriate stimulus. (8.1) Regulatory gene Gene that codes for a bacterial repressor protein. (12.1) Renaturation (reannealing) Reassociation of complementary single-stranded DNA molecules that had been previously denatured. (10.4) Replica Metal-carbon cast of a tissue surface used in electron microscopy. Variations in the thickness of the metal in different parts of the replica cause variations in the number of penetrating electrons to reach the viewing screen. (18.2)

Replication Duplication of the genetic material. (13) Replication foci Localization of active replication forks in the cell nucleus. There are about 50 to 250 foci, each of which contains approximately 40 replication forks incorporating nucleotides into DNA strands simultaneously. (13.1) Replication forks The points at which the pair of replicated segments of DNA come together and join the nonreplicated segments. Each replication fork corresponds to a site where (1) the parental double helix is undergoing strand separation, and (2) nucleotides are being incorporated into the newly synthesized complementary strands. (13.1) Repressor A gene regulatory protein that binds to DNA and inhibits transcription. (12.2) Resolution The ability to see two neighboring points in the visual field as distinct entities. (18.1) Response elements The sites at which specific transcription factors bind to the regulatory regions of a gene. (12.4) Resting potential The electrical potential difference measured for an excitable cell when it is not subject to external stimulation. (4.8) Restriction endonucleases (restriction enzymes) Nucleases contained in bacteria that recognize short nucleotide sequences within duplex DNA and cleave the backbone at highly specific sites on both strands of the duplex. (18.12) Restriction map A type of physical map of the chromosome based on the identification and ordering of sets of fragments generated by restriction enzymes. (18.12) Retrotransposons Transposable elements that require reverse transcriptase for their movements within the genome. (10.5) Reverse transcriptase An RNA-dependent DNA polymerase. An enzyme that uses RNA as a template to synthesize a complementary strand of DNA. [An enzyme that is found in RNA-containing viruses and used in the laboratory to synthesize cDNAs.] (10.5) Ribonucleic acid (RNA) A single-stranded nucleic acid composed of a polymeric chain of ribose-containing nucleotides. 
(2.5) Ribosomal RNAs (or rRNAs) The RNAs of a ribosome. rRNAs recognize and bind other molecules, provide structural support, and catalyze the chemical reaction in which amino acids are covalently linked to one another. (11.1) Riboswitches mRNAs that, once bound to a metabolite, undergo a change in their folded conformation that allows them to alter the expression of a gene involved in production of that metabolite. Most riboswitches suppress gene expression by blocking either

termination of transcription or initiation of translation. (12.1) Ribozyme An RNA molecule that functions as a catalyst in cellular reactions. (2.5) RNA interference (RNAi) A naturally occurring phenomenon in which double-stranded RNAs (dsRNAs) lead to the degradation of mRNAs having identical sequences. RNAi is believed to function primarily in blocking the replication of viruses and restricting the movement of mobile elements, both of which involve the formation of dsRNA intermediates. Mammalian cells can be made to engage in RNAi by treatment of the cells with small (21 nt) RNAs. These small RNAs (siRNAs) induce the degradation of mRNAs that contain the same sequence. (11.5) RNA polymerase I The transcribing enzyme found in eukaryotic cells that synthesizes the large (28S, 18S, and 5.8S) ribosomal RNAs. (11.3) RNA polymerase II The transcribing enzyme found in eukaryotic cells that synthesizes messenger RNAs and most small nuclear RNAs. (11.4) RNA polymerase III The transcribing enzyme found in eukaryotic cells that synthesizes the various transfer RNAs and the 5S ribosomal RNA. (11.3) RNA silencing A process in which small, noncoding RNAs, typically derived from longer double-stranded precursors, trigger sequence-specific inhibition of gene expression. (11.5) RNA splicing The process of removing the intervening RNA sequences (introns) from a primary transcript. (11.4) RNA tumor viruses Retroviruses capable of infecting vertebrate cells, transforming them into cancer cells. RNA viruses have RNA in the mature virus particle. (16.2) RNA world A proposed stage in the early evolution of life before the appearance of DNA and proteins in which RNA molecules served both as genetic material and catalysts. (11.4) Rough endoplasmic reticulum (RER) That part of the endoplasmic reticulum that has ribosomes attached. The RER appears as an extensive membranous organelle composed primarily of flattened sacs (cisternae) separated by a cytosolic space.
RER functions include synthesis of secretory proteins, lysosomal proteins, integral membrane proteins, and membrane lipids. (8.3) S phase The phase of the cell cycle in which replication occurs. (14.1) Saltatory conduction Propagation of a nerve impulse when one action potential triggers another at the adjacent stretch of unwrapped membrane (i.e., propagating by causing the

action potentials to jump from one node of Ranvier to the next). (4.8) Sarcomeres Contractile units of myofibrils that are endowed with a characteristic pattern of bands and stripes that give skeletal muscle cells their striated appearance. (9.6) Sarcoplasmic reticulum (SR) A system of cytoplasmic, Ca²⁺-storing SER membranes in muscle cells that forms a membranous sleeve around the myofibril. (9.6) Saturated fatty acids Those lacking double bonds between carbons. (2.5) Second law of thermodynamics Events in the universe proceed from a state of higher energy to a state of lower energy. (3.1) Second messenger A substance that is formed in the cell as the result of the binding of a first messenger—a hormone or other ligand—to a receptor at the outer surface of the cell. (15.3) Secondary culture Transfer of previously cultured cells to a culture medium. (18.5) Secondary structure The three-dimensional arrangement of portions of a polypeptide chain. (2.5) Secondary walls Thicker cell walls found in most mature plant cells. (7.6) Secreted Discharged outside the cell. (8.1) Secretory granule Large, densely packed, membrane-bound structure containing highly concentrated secretory materials that are discharged into the extracellular space (secreted) following a stimulatory signal. (8.1, 8.5) Secretory pathway (biosynthetic pathway) Route through the cytoplasm by which materials are synthesized in the endoplasmic reticulum or Golgi complex, modified during passage through the Golgi complex, and transported within the cytoplasm to various destinations such as the plasma membrane, a lysosome, or a large vacuole of a plant cell. Many of the materials synthesized in the endoplasmic reticulum or Golgi complex are destined to be discharged outside the cell; hence the term secretory pathway has been used. (8.1) Section A very thin slice of tissue.
(18.1) Selectins A family of integral membrane glycoproteins that recognize and bind to specific arrangements of carbohydrate groups projecting from the surface of other cells. (7.3) Selectively permeable barrier Any structure, such as a plasma membrane, that allows some substances to freely pass through while denying passage to others. (4.1) Self-assembly The property of proteins (or other structures) to assume the correct (native) conformation based on the chemical behavior dictated by the amino acid sequence. (2.5)

Semiconservative Replication in which each daughter cell receives one strand of the parent DNA helix. (13.1) Semipermeable The membrane property of being freely permeable to water while allowing much slower passage to small ions and polar solutes. (4.7) Serial sections A series of successive sections cut from a block of tissue. (18.1) SH2 domains Domains with high-affinity binding sites for phosphotyrosine motifs. Found in a variety of proteins involved in cell signaling. (15.4) Side chain or R group The defining functional group of an amino acid, which can range from a single hydrogen to complex polar or non-polar units in the 20 amino acids most commonly found in cells. (2.5) Signal peptidase Proteolytic enzyme that removes the N-terminal portion including the signal peptide of a nascent polypeptide synthesized in the RER. (8.3) Signal recognition particle (SRP) A particle consisting of six distinct polypeptides and a small RNA molecule, called the 7S RNA, that recognizes the signal sequence as it emerges from the ribosome. SRP binds to the signal sequence and then to an ER membrane. (8.3) Signal sequence Special series of amino acids located at the N-terminal portion of newly forming proteins that triggers the attachment of the protein-forming ribosome to an ER membrane and the movement of the nascent polypeptide into the cisternal space of the ER. (8.3) Signal transduction The overall process in which information carried by extracellular messenger molecules is translated into changes that occur inside a cell. (15.1) Signaling pathways The information superhighways of the cell. Each consists of a series of distinct proteins that operate in sequence. Each protein in the pathway acts by altering the conformation of the downstream protein in the series. (15.1) Single nucleotide polymorphisms (SNPs) Sites in the genome where alternate bases are found with high frequency in the population. SNPs are excellent genetic markers for genome mapping studies.
(10.6) Single-particle tracking (SPT) A technique for studying movement of membrane proteins that consists of two steps: (1) linking the protein molecules to visible substances such as colloidal gold particles and (2) monitoring the movements of the individual tagged particles under the microscope. (4.6) Single-stranded DNA-binding (or SSB) proteins Proteins that facilitate the separation of the DNA strands by their attachment to bare, single DNA strands,

keeping them in an extended state and preventing them from becoming rewound. (13.1) Site-directed mutagenesis A research technique to modify a gene in a predetermined way so as to produce a protein with a specifically altered amino acid sequence. (18.17) Sliding clamp A ring-shaped protein that plays a key role in DNA replication by encircling the DNA and imparting processivity to the replicative DNA polymerase. (13.1) Small interfering RNAs (siRNAs) Small (21-23 nucleotide), double-stranded fragments formed when double-stranded RNA initiates the response during RNA silencing. (11.5) Small nuclear RNAs (snRNAs) RNAs required for mRNA processing that are small (90 to 300 nucleotides long) and that function in the nucleus. (11.4) Small nucleolar RNAs (snoRNAs) RNAs required for the methylation and pseudouridylation of pre-rRNAs during ribosome formation in the nucleolus. (11.3) snoRNPs (small nucleolar ribonucleoproteins) Particles that are formed when snoRNAs are packaged with particular proteins; snoRNPs play a role in the maturation and assembly of ribosomal RNAs. (11.3) Smooth endoplasmic reticulum (SER) That part of the endoplasmic reticulum that is without attached ribosomes. The membranous elements of the SER are typically tubular and form an interconnecting system of pipelines curving through the cytoplasm in which they occur. The SER functions vary from cell to cell and include the synthesis of steroid hormones, detoxification of a wide variety of organic compounds, mobilization of glucose from glucose 6-phosphate, and sequestration of calcium ions. (8.3) SNAREs Key proteins that mediate the process of membrane fusion. T-SNAREs are located in the membranes of target compartments. V-SNAREs incorporate into the membranes of transport vesicles during budding. (8.5) snRNPs Distinct ribonucleoprotein particles contained in spliceosomes, so called because they are composed of snRNAs bound to specific proteins.
(11.4) Sodium-potassium pump (Na⁺/K⁺-ATPase) Transport protein that uses ATP as the energy source for transporting sodium and potassium ions, with the result that each conformational change transports three sodium ions out of the cell and two potassium ions into the cell. (4.7) Somatic cells Cells of the body, excluding cells of the germ line (i.e., those that can give rise to gametes).

Specific activity The ratio of the amount of a protein of interest to the total amount of protein present in a sample, which is used as a measure of purification. (18.7) Specificity The property of selective interaction between components of a cell that is basic to life. (2.5) Spectrophotometer Instrument used to measure the amount of light of a specific wavelength that is absorbed by a solution. If one knows the absorbance characteristics of a particular type of molecule, then the amount of light of the appropriate wavelength absorbed by a solution of that molecule provides a sensitive measure of its concentration. (18.7) Sphingolipids A class of membrane lipids (derivatives of sphingosine) that consist of sphingosine linked to a fatty acid by its amino group. (4.3) Spindle checkpoint A checkpoint that operates at the transition between metaphase and anaphase; the spindle checkpoint is best revealed when a chromosome fails to become aligned properly at the metaphase plate. (14.2) Splice sites The 5′ and 3′ ends of each intron. (11.4) Spliceosome A macromolecular complex containing a variety of proteins and a number of distinct ribonucleoprotein particles that functions in removal of introns from a primary transcript. (11.4) Split genes Genes with intervening sequences. (11.4) Spontaneous reactions Reactions that are thermodynamically favorable, capable of occurring without any input of external energy. (3.1) Sporophyte A diploid stage of the life cycle of plants that begins with the union of two gametes to form a zygote. During the sporophyte stage, meiosis occurs, producing spores that germinate directly into a haploid gametophyte. (14.3) SRP receptor Situated within the ER membrane, the SRP receptor binds specifically with the SRP-ribosome complex.
(8.3) Standard free-energy change (ΔG°′) The change in free energy when one mole of each reactant is converted to one mole of each product under defined standard conditions: temperature of 298 K and pressure of 1 atm. (3.1) Starch Mixture of two glucose polymers, amylose and amylopectin, that serves as readily available chemical energy in most plant cells. (2.5) Steady state Metabolic condition in which concentrations of reactants and products are essentially constant, although individual reactions may not be at equilibrium. (3.1)

Stem cells Cells situated in various tissues of the body that constitute a reserve population capable of giving rise to the various cells of that tissue. Stem cells can be defined as undifferentiated cells that are capable of both (1) self-renewal, that is, production of cells like themselves, and (2) differentiation into two or more mature cell types. (HP1) Stereoisomers Two molecules that structurally are mirror images of each other and may have vastly different biological activity. (2.5) Steroid Lipid molecule based on a characteristic four-ring hydrocarbon skeleton, including cholesterol and hormones such as testosterone and progesterone. (2.5) Stomata Openings at the surface of leaves through which gas and water are exchanged between the plant and the air. (6.6) Stop codons Three of the 64 possible trinucleotide codons whose function is to terminate polypeptide assembly. (11.6) Stroma Space outside the thylakoid but within the relatively impermeable inner membrane of the chloroplast envelope. (6.1) Stroma thylakoids Also called stroma lamellae, these are flattened membranous cisternae that connect some of the thylakoids of one granum with those of another. (6.1) Structural genes Genes that code for protein molecules. (12.1) Structural isomers Molecules having the same chemical formula but different structures. (2.4) Subcellular fractionation An approach that allows different organelles (e.g. nucleus, mitochondrion, plasma membrane, endoplasmic reticulum) having different properties, to be separated from one another. (8.2) Substrate The reactant bound by an enzyme. (3.2) Substrate-level phosphorylation Direct synthesis of ATP through the transfer of a phosphate group from a substrate to ADP. (3.3) Subunit A polypeptide chain that associates with other chains (subunits) to form a complete protein or protein complex. (2.5) Supercoiled A molecule of DNA that has greater or fewer than 10 base pairs per turn of the helix. 
(10.3) Surface area/volume ratio The proportion between cellular dimensions that indicates how effectively (or whether) a cell can sustainably exchange substances with its environment. (1.3) Synapse The specialized junction of a neuron with its target cell. (4.8) Synapsis The process by which homologous chromosomes become joined to one another during meiosis. (14.3) Synaptic cleft The narrow gap between two excitable cells. A presynaptic cell conducts

impulses toward a synapse; a postsynaptic cell always lies on the receiving side of a synapse. (4.8) Synaptic vesicles The storage sites for neurotransmitter within the terminal knobs of a neuronal axon. (4.8) Synaptonemal complex (SC) A ladderlike structure composed of three parallel bars with many cross fibers. The SC holds each pair of homologous chromosomes in the proper position to allow the continuation of genetic recombination between strands of DNA. (14.3) tDNA The DNA encoding tRNAs. (11.3) T-cell receptor (TCR) Proteins present on the surface of T lymphocytes that mediate interaction with specific cell-bound antigens. Like the immunoglobulin of B cells, these proteins are formed by a process of DNA rearrangement that generates a specific antigen-combining site. TCRs consist of two subunits, each containing both a variable and a constant domain. (17.3) T lymphocytes (T cells) Lymphocytes that respond to antigen by proliferating and differentiating into either CTLs (cytotoxic lymphocytes) that attack and kill infected cells or TH cells that are required for antibody production by B cells. These cells attain their differentiated state in the thymus. (17.2) Tandem repeats A cluster in which a DNA sequence repeats itself over and over again without interruption. (10.3) Telomere An unusual stretch of repeated DNA sequences, which forms a “cap” at each end of a chromosome. (12.2) Telomerase A novel enzyme that can add new repeat units of DNA to the 3′ end of the overhanging strand of a telomere. Telomerase is a reverse transcriptase that synthesizes DNA using an RNA template. (12.2) Telophase The final stage of mitosis in which daughter cells return to the interphase condition: the mitotic spindle disassembles, the nuclear envelope reforms, and the chromosomes become more and more dispersed until they disappear from view under the microscope.
(14.2) Temperature-sensitive (ts) mutations Mutations that are only expressed phenotypically when the cells (or organism) are grown at a higher (restrictive) temperature. At the lower (permissive) temperature, the encoded protein is able to hold together sufficiently well to carry out its activity, leading to a relatively normal phenotype. ts mutations are particularly useful for studying required activities such as secretion and replication, because “ordinary” mutations affecting these processes are typically lethal. (13.1)

Template A single strand of DNA (or RNA) that contains the information (encoded as a nucleotide sequence) for construction of a complementary strand. (13.1) Tertiary structure The three-dimensional shape of an entire macromolecule. (2.5) Tetrad (bivalent) The complex formed during meiosis by a pair of synapsed homologous chromosomes that includes four chromatids. (14.3) Thermodynamics Study of the changes in energy accompanying events in the physical universe. (3.1) Thermodynamics, first law of Principle of conservation of energy that states energy can neither be created nor destroyed, merely transduced (converted) from one form to another. (3.1) Thermodynamics, second law of Principle that all events move from a higher energy state to a lower energy state, and thus are spontaneous. (3.1) Thick filaments One of two distinct types of filaments that give sarcomeres their characteristic appearance. Thick filaments consist primarily of myosin and are surrounded by a hexagonal array of thin filaments. (9.6) Thin filaments One of two distinct types of filaments that give sarcomeres their characteristic appearance. Thin filaments consist primarily of actin and are arranged in a hexagonal array around each thick filament, with each thin filament situated between two thick filaments. (9.6) Threshold The point during depolarization of an excitable cell where voltage-gated sodium channels open, with the resulting Na⁺ influx causing a brief reversal in membrane potential. (4.8) Thylakoids Flattened, membranous sacs formed by the chloroplast’s internal membrane, which contain the energy-transducing machinery for photosynthesis. (6.1) Tight junctions Specialized contacts that occur at the very apical end of the junctional complex between adjacent epithelial cells. The adjoining membranes make contact at intermittent points, where integral proteins of the two adjacent membranes meet.
(7.4) Toll-like receptors (TLRs) A type of pathogen receptor of the innate immune system. Humans express at least 10 functional TLRs, all of which are transmembrane proteins present on the surfaces of many different types of cells. (17.1) Tonoplast The membrane that bounds the vacuole of a plant cell. (8.7) Topoisomerases Enzymes found in both prokaryotic and eukaryotic cells that are able to change the supercoiled state of the DNA duplex. They are essential in

processes, such as DNA replication and transcription, that require the DNA duplex to unwind. (10.3) Trans cisternae The cisternae of the Golgi complex farthest from the endoplasmic reticulum. (8.4) Trans Golgi network (TGN) A network of interconnected tubular elements at the trans end of the Golgi complex that sorts and targets proteins for delivery to their ultimate cellular or extracellular destination. (8.4) Transcription The formation of a complementary RNA from a DNA template. (11.1) Transcription factors Auxiliary proteins (beyond the polypeptides that make up the RNA polymerases) that bind to specific sites in the DNA and alter the transcription of nearby genes. (11.2, 12.4) Transcription unit The corresponding segment of DNA on which a primary transcript is transcribed. (11.4) Transcriptional-level control Determination whether a particular gene can be transcribed and, if so, how often. (12.4) Transcriptome The entire inventory of RNAs transcribed by a particular cell, tissue, or organism. (11.5) Transduction The incorporation of a gene into a cellular genome by means of a virus. (18.7) Transfection A process by which naked DNA is introduced into cultured cells typically leading to the incorporation of the DNA into the cellular genome and its subsequent expression. (18.7) Transfer potential A measure of the ability of a molecule to transfer any group to another molecule, with molecules having a higher affinity for the group being the better acceptors and molecules having a lower affinity better donors. (3.3) Transfer RNAs (tRNAs) A family of small RNAs that translate the information encoded in the nucleotide “alphabet” of an mRNA into the amino acid “alphabet” of a polypeptide. (11.1) Transformation Uptake of naked DNA into a cell leading to a heritable change in the cell’s genome. (EP10) Transformed Normal cells that have been converted to cancer cells by treatment with a carcinogenic chemical, radiation, or an infective tumor virus.
(16.1) Transgene A gene that has been stably incorporated into a cellular genome by the process of transfection. (18.7) Transgenic animals Animals that have been genetically engineered so that their chromosomes contain foreign genes. (18.12) Transition state The point during a chemical reaction at which bonds are being broken and reformed to yield products. (3.2)

Transition temperature The temperature at which a membrane is converted from a fluid state to a crystalline gel in which lipid-molecule movement is greatly reduced. (4.5) Translation Synthesis of proteins in the cytoplasm using the information encoded by an mRNA. (11.1, 11.8) Translational-level control Determination whether a particular mRNA is actually translated and, if so, how often and for how long a period. (12.6) Translesion synthesis Replication that bypasses a lesion in the template strand. Conducted by a special group of DNA polymerases that lack processivity, proofreading capacity, and high fidelity. (13.3) Translocation A chromosomal aberration that results when all or part of one chromosome becomes attached to another chromosome. (HP12) Translocation The step in the translation elongation cycle that involves (1) ejection of the uncharged tRNA from the P site and (2) the movement of the ribosome three nucleotides (one codon) along the mRNA in the 3′ direction. (11.8) Translocon A protein-lined channel embedded in the ER membrane; the nascent polypeptide is able to move through the translocon in its passage from the cytosol to the ER lumen. (8.3) Transmembrane domain The portion of a membrane protein that passes through the lipid bilayer, often composed of non-polar amino acids in an α-helical conformation. (4.4) Transmembrane signaling Transfer of information across the plasma membrane. (7.3, 15) Transmission electron microscopes (TEMs) Microscopes that form images from electrons that are transmitted through a specimen. (18.2) Transport vesicles The shuttles, formed by budding from a membrane compartment, that carry materials between organelles. (8.1) Transposable elements DNA segments that move from one place on a chromosome to a completely different site, often affecting gene expression. (10.5) Transposition Movement of DNA segments from one place on a chromosome to an entirely different site, often affecting gene expression.
(10.5) Transposons DNA segments capable of moving from one place in the genome to another. (10.5) Transverse (T) tubules Membranous folds along which the impulse generated in a skeletal muscle cell is propagated into the interior of the cell. (9.6)

Triacylglycerols Polymers consisting of a glycerol backbone linked by ester bonds to three fatty acids, commonly called fats. (2.5) Tricarboxylic acid cycle (TCA cycle) The circular metabolic pathway that oxidizes acetyl CoA, conserving its energy; the cycle is also known as the Krebs cycle or the citric acid cycle. (5.2) Trisomy A chromosome complement that has one extra chromosome, i.e., a third homologous chromosome. (HP14) Tubulin The protein that forms the walls of microtubules. Isoforms include α, β, and γ tubulin. (9.3) Tumor-suppressor genes Genes that encode proteins that restrain cell growth and prevent cells from becoming malignant. (16.3) Turgor pressure Hydrostatic pressure that builds up in a plant cell due to the hypertonic state of the intracellular compartment. Turgor pressure is exerted against the surrounding cell wall and provides support for plant tissues. (4.7) Turnover number The maximum number of substrate molecules that can be converted to product by one enzyme molecule per unit of time. (3.2) Turnover The regulated destruction of cellular materials and their replacement. (8.7) Ubiquinone A component of the electron transport chain, ubiquinone is a lipid-soluble molecule containing a long hydrophobic chain composed of five-carbon isoprenoid units. (5.3) Ubiquitin A small, highly conserved protein that is linked to proteins targeted for internalization by endocytosis or degradation in proteasomes. (8.8, 12.7) Unconventional myosins (See Myosin.) Unfolded protein response (UPR) A comprehensive response that occurs in cells whose ER cisternae contain an excessively high concentration of unfolded or misfolded proteins. Sensors that detect this situation trigger a pathway that leads to the synthesis

of proteins (e.g., molecular chaperones) that can alleviate the stress in the ER. (8.3) Unsaturated fatty acids Those having one or more double bonds between carbon atoms. (2.5) Untranslated regions (UTRs) Noncoding segments contained at both 5′ and 3′ ends of mRNAs. (12.6) Vacuole A single membrane-bound, fluid-filled structure that comprises as much as 90% of the volume of many plant cells. (8.7) Van der Waals force A weak attractive force due to transient asymmetries of charge within adjacent atoms or molecules. (2.2) Variable regions The portions of light and heavy antibody polypeptide chains that differ in amino acid sequence from one specific antibody to another. (17.4) V(D)J joining The DNA rearrangements that occur during the development of B cells that limit the cells to the production of a specific antibody species. (17.4) Vector DNA A vehicle for carrying foreign DNA into a suitable host cell, such as the bacterium E. coli. The vector contains sequences that allow it to be replicated inside the host cell. Most often, the vector is a plasmid or the bacterial virus lambda (λ). Once the DNA is inside the bacterium, it is replicated and partitioned to the daughter cells. (18.12) Virion The form a virus assumes outside of a cell, which consists of a core of genetic material surrounded by a protein or lipoprotein capsule. (1.4) Viroids Small, obligatory intracellular pathogens that, unlike viruses, consist only of an uncoated circle of genetic material, RNA. (1.4) Viruses Small, obligatory intracellular pathogens that are not considered to be alive because they cannot divide directly, which is required by the cell theory of life. (1.4)

Whole mount A specimen to be observed with a microscope that is an intact object, either living or dead, and can be an entire intact organism or a small part of a large organism. (18.1) Wild type The original strain of a living organism from which other organisms are bred for research. (10.2) Wobble hypothesis Crick’s proposal that the steric requirement between the anticodon of the tRNA and the codon of the mRNA is flexible at the third position, which allows two codons that differ only at the third position to share the same tRNA during protein synthesis. (11.7)

X-ray crystallography (X-ray diffraction) A technique that bombards protein crystals with a thin beam of X-rays of a single (monochromatic) wavelength. The radiation that is diffracted by the electrons of the protein atoms strikes a photographic plate or sensor. The diffraction pattern produced by the crystal is determined by the structure within the protein. (2.5, 18.8)

Yeast artificial chromosomes (YACs) Cloning elements that are artificial versions of a normal yeast chromosome. They contain all of the elements of a yeast chromosome that are necessary for the structure to be replicated during S phase and segregated to daughter cells during mitosis, plus a gene whose encoded product allows those cells containing the YAC to be selected from those that lack the element and the DNA fragment to be cloned. (18.15) Yeast two-hybrid system A technique used to search for protein-protein interactions. It depends on the expression of a reporter gene such as β-galactosidase, whose activity is readily monitored by a test that detects a color change when the enzyme is present in a population of yeast cells. (2.5, 18.7)

Additional Readings (Additional readings for each chapter can be found on the book’s student companion site on the Web at www.wiley.com/college/karp.) CHAPTER 1 TEXT READINGS General References in Microbiology and Virology Knipe, D. M., et al. 2007. Fields Virology, 5th ed. Lippincott. Madigan, M. T., et al. 2011. Brock—Biology of Microorganisms, 13th ed. Benjamin Cummings.

Other Readings Buchen, L. 2010. The new germ theory. Nature 468:492–495. [GI microbes] Cherry, A. B. C. & Daley, G. Q. 2012. Reprogramming cellular identity for regenerative medicine. Cell 148:1110–1122. Cho, M. K. & Relman, D. A. 2010. Synthetic “life,” ethics, national security, and public discourse. Science 329:38–39. Hanna, J. H., et al., 2010. Pluripotency and cellular reprogramming: facts, hypotheses, unresolved issues. Cell 143:508–525. Hayden, E. C. 2010. Life is complicated. Nature 464:664–667. Hayden, E. C. 2011. The growing pains of pluripotency. Nature 473:272–274. Janssens, S. 2010. Stem cells in the treatment of heart disease. Ann. Rev. Med. 61:287–300. Keasling, J. D. 2010. Manufacturing molecules through metabolic engineering. Science 330:1355–1358. Koonin, E. V. 2010. The incredible expanding ancestor of eukaryotes. Cell 140:606–608. Mascarelli, A. 2009. Low life. Nature 459:770–773. [soil microbes] Nicholas, C. R. & Kriegstein, A. R. 2010. Cell reprogramming gets direct. Nature 463:1031–1032. Pera, M. F. 2011. The dark side of pluripotency. Nature 471:46–47. Pennisi, E. 2010. Synthetic genome brings new life to bacterium. Science 328:958–959. Shevde, N. 2012. Flexible friends. Nature 483 (March 1): S22–S26. [cell reprogramming] Sonnenburg, J. L. 2010. Genetic pot luck. Nature 464:837–838. [GI microbes] Tiscornia, G., et al., 2011. Diseases in a dish: modeling human genetic disorders using iPS cells. Nature Med. 17:1570–1576. Wu, S. M. & Hochedlinger, K. 2011. Harnessing the potential of induced pluripotent stem cells for regenerative medicine. Nature Cell Biol. 13:497–505. Yamanaka, S. & Blau, H. M. 2010. Nuclear reprogramming to a pluripotent state by three approaches. Nature 465:704–712.

CHAPTER 2 TEXT READINGS General Biochemistry BERG, J. M., TYMOCZKO, J. L. & STRYER, L. 2010. Biochemistry, 7th ed. W. H. Freeman. NELSON, D. L. & COX, M. M. 2009. Lehninger Principles of Biochemistry, 5th ed. W. H. Freeman. VOET, D. & VOET, J. G. 2010. Biochemistry, 4th ed. Wiley.

Other Readings Bornscheuer, U. T., et al., 2012. Engineering the third wave of biocatalysis. Nature 485:185–194. Brody, H., et al., 2011. Nature outlook: Alzheimer’s disease. Nature 475:S1–S22 (7/14 issue). Chouard, T. 2011. Breaking the protein rules. Nature 471:151–153. Citron, M. 2010. Alzheimer’s disease: strategies for disease modification. Nature Revs. Drug Disc. 9:387–399. Cushman, M., et al., 2010. Prion-like disorders: blurring the divide between transmissibility and infectivity. J. Cell Science 123:1191–1201. Dalby, P. A. 2011. Strategy and success for the directed evolution of enzymes. Curr. Opin. Struct. Biol. 21:473–480. Hekimi, S., et al., 2011. Taking a “good” look at free radicals in the aging process. Trends Cell Biol. 21:569–575. Huang, Y. & Mucke, L. 2012. Alzheimer mechanisms and therapeutic strategies. Cell 148:1204–1222. Itzhaki, L. S., et al., 2012. Protein folding and binding. Curr. Opin. Struct. Biol. 22, #1. Miller, G. 2009. Alzheimer’s biomarker initiative hits its stride. Science 326:386–389. Karran, E., et al., 2011. The amyloid cascade hypothesis for Alzheimer’s disease: an appraisal for the development of therapeutics. Nature Revs. Drug Disc. 10:698–712. Pearson, H. 2012. Raising the dead. Nature 483:390–393. [ancestral protein evolution] Sosnick, T. R. & Barrick, D. 2011. The folding of single domain proteins—have we reached a consensus? Curr. Opin. Struct. Biol. 21:12–24. Thompson, C. B. 2009. Attacking cancer at its root. Cell 138:1051–1054. [Gleevec development] Vinson, V. J., et al., 2009. Proteins in motion. Science 324:197–215.

CHAPTER 3 TEXT READINGS BENKOVIC, S. J. & HAMMES-SCHIFFER, S. 2003. A perspective on enzyme catalysis. Science 301:1196–1202. FISCHBACH, M. A. & WALSH, C. T. 2009. Antibiotics for emerging pathogens. Science 325:1089–1093. HAMMES, G. G. 2000. Thermodynamics and Kinetics for the Biological Sciences. Wiley. Hammes, G. G. 2008. How do enzymes really work? J. Biol. Chem. 283:22337–22346. Hardie, D. G., et al., 2012. AMPK: a nutrient and energy sensor that maintains energy homeostasis. Nature Revs. Mol. Cell Biol. 13:251–262. HAROLD, F. M. 1986. The Vital Force: A Study of Bioenergetics. Freeman. HARRIS, D. A. 1995. Bioenergetics at a Glance. Blackwell. JENCKS, W. P. 1997. From chemistry to biochemistry to catalysis to movement. Annu. Rev. Biochem. 66:1–18. KORNBERG, A. 1989. For the Love of Enzymes. Harvard. Koshland, D. E., Jr. 2004. Crazy, but correct. Nature 432:447. [on postulation of induced fit hypothesis] Kraut, D. A., et al., 2003. Challenges in enzyme mechanism and energetics. Annu. Rev. Biochem. 72:517–571. KRAUT, J. 1988. How do enzymes work? Science 242:533–540. Nikaido, H. 2009. Multidrug resistance in bacteria. Ann. Rev. Biochem. 78:119–146. Ringe, D. & Petsko, G. A. 2008. How enzymes work. Science 320:1428–1429. Schramm, V. L. 2011. Enzymatic transition states, transition-state analogs, dynamics, thermodynamics, and lifetimes. Ann. Rev. Biochem. 80:703–732. Vrielink, A. & Sampson, N. 2003. Sub-Ångstrom resolution enzyme X-ray structures: is seeing believing? Curr. Opin. Struct. Biol. 13:709–715. WALSH, C., ET AL., 2001. Reviews on biocatalysis. Nature 409:226–268.

CHAPTER 4 TEXT READINGS Reviews on membranes can be found each year in Curr. Opin. Struct. Biol. issue #4. Bogdanov, M., et al., 2009. Lipid-protein interactions drive membrane protein topogenesis in accordance with the positive-inside rule. J. Biol. Chem. 284:9637–9641.


Boudker, O. & Verdon, G. 2010. Structural perspectives on secondary active transporters. Trends. Pharmacol. Sci. 31:418–426. Bublitz, M., et al. 2011. P-type ATPases at a glance. J. Cell Sci. 124:2515–2519. Gadsby, D. C. 2009. Ion channels versus ion pumps: the principal difference, in principle. Nature Revs. Mol. Cell Biol. 10:344–352. Khalili-Araghi, F., et al., 2009. Molecular dynamics simulations of membrane channels and transporters. Curr. Opin. Struct. Biol. 19:128–137. Kusumi, A., et al., 2011. Hierarchical mesoscale domain organization of the plasma membrane. Trends Biochem. Sci. 36:604–615. Lee, A. G. 2011. Biological membranes: the importance of molecular detail. Trends Biochem. Sci. 36:493–502. London, E. & Shahidullah, K. 2009. Transmembrane vs. non-transmembrane hydrophobic helix topography in model and natural membranes. Curr. Opin. Struct. Biol. 19:464–472. Morth, J. P., et al., 2011. A structural overview of the plasma membrane Na⁺, K⁺-ATPase and H⁺-ATPase ion pumps. Nature Revs. Mol. Cell Biol. 12:60–70. Shevchenko, A. & Simons, K. 2010. Lipidomics: coming to grips with lipid diversity. Nature Revs. Mol. Cell Biol. 11:593–598. Simons, K. & Gerl, M. J. 2010. Revitalizing membrane rafts: new tools and insights. Nature Revs. Mol. Cell Biol. 11:688–699. Tate, C. G. & Stevens, R. C., eds. 2010. Membranes. Curr. Opin. Struct. Biol. 20, #4. von Heijne, G. 2006. Membrane-protein topology. Nat. Revs. Mol. Cell Biol. 7:909–918. White, S. H., et al., 2009. Protein biophysics. Nature 459:343–385.

CHAPTER 5 TEXT READINGS Chan, S. I. 2010. Proton pumping in cytochrome c oxidase: the coupling between proton and electron gating. PNAS 107:8505–8506. Efremov, R. G. & Sazanov, L. A. 2011. Respiratory complex I: “steam engine” of the cell. Curr. Opin. Struct. Biol. 21:532–540. Farmer, S. R. 2009. Obesity: Be cool, lose weight. Nature 458:839–840. [UCP1 and BAT] Fischer, W. W. 2008. Life before the rise of oxygen. Nature 455:1051–1052. Ferguson, S. J. 2010. ATP synthase: from sequence to ring size to the P:O ratio. PNAS 107:16755–16756. Junge, W. & Müller, D. J. 2011. Seeing a molecular motor at work. Science 333:704–705.

Kageyama, Y., et al., 2011. Mitochondrial division: molecular machinery and physiological functions. Curr. Opin. Cell Biol. 23:427–434. Nunnari, J. & Suomalainen, A. 2012. Mitochondria: in sickness and in health. Cell 148:1145–1159. Ohnishi, T. 2010. Piston drives a proton pump. Nature 465:428–429. Park, C. B. & Larsson, N.-G. 2011. Mitochondrial DNA mutations in disease and aging. J. Cell Biol. 193:809–818. von Ballmoos, C., et al., 2008. Unique rotary ATP synthase and its biological diversity. Annu. Rev. Biophys. 37:43–64. von Ballmoos, C., et al., 2009. Essentials for ATP synthesis by F1-F0 ATP synthases. Ann. Rev. Biochem. 78:649–672. Wallace, D. C. & Fan, W. 2009. The pathophysiology of mitochondrial disease as modeled in the mouse. Genes Develop. 23:1714–1736. Westermann, B. 2010. Mitochondrial fusion and fission in cell life and death. Nature Revs. Mol. Cell Biol. 11:872–884.

CHAPTER 6 TEXT READINGS Allen, J. F. & Martin, W. 2007. Out of thin air. Nature 445:610–612. [evolution of photosynthesis] Eberhard, S., et al., 2008. The dynamics of photosynthesis. Ann. Rev. Gen. 42:463–515. Nelson, N. & Ben-Shem, A. 2004. The complex architecture of oxygenic photosynthesis. Nature Revs. Mol. Cell Biol. 5:971–982. Nelson, N. & Yocum, C. F. 2006. Structure and function of photosystems I and II. Annu. Rev. Plant Biol. 57:521–565. Shikanai, T. 2007. Cyclic electron transport around photosystem I: genetic approaches. Annu. Rev. Plant Biol. 58:199–217. West-Eberhard, M. J., et al., 2011. Photosynthesis, reorganized. Science 332:311–312. [evolution of C4 and CAM plants]

CHAPTER 7 TEXT READINGS Reviews on cell-to-cell contact and extracellular matrix can be found each year in Curr. Opin. Cell Biol. issue #5 Cox, D., et al., 2010. Integrins as therapeutic targets. Nature Revs. Drug Disc. 9:804–820. Desai, B.V., et al., 2009. Desmosomes at a glance. J. Cell Sci. 122:4401–4407. Desgrosellier, J.S. & Cheresh, D.A. 2010. Integrins in cancer. Nature Revs. Cancer 10:9–22. Harris, T.J.C. & Tepass, U. 2010. Adherens junctions: from molecules to morphogenesis. Nature Revs. Mol. Cell Biol. 11:502–514.

Kadler, K. E., et al., 2007. Collagens at a glance. J. Cell Sci. 120:1955–1958. Kessenbrock, K., et al., 2010. Matrix metalloproteinases: regulators of the tumor microenvironment. Cell 141:52–67. Kim, C., et al., 2011. Regulation of integrin activation. Ann. Rev. Cell Dev. Biol. 27:321–345. Leckband, D.E., et al., 2011. Mechanotransduction at cadherin-mediated adhesions. Curr. Opin. Cell Biol. 23:523–530. Moser, M., et al., 2009. The tail of integrins, talin, and kindlins. Science 324:895–899. Nieto, M.A. 2011. The ins and outs of the epithelial to mesenchymal transition in health and disease. Ann. Rev. Cell Dev. Biol. 27:347–376. Schwartz, M. A. 2009. The force is with us. Science 323:588–589. [forces on focal adhesions] Shattil, S.J., et al. 2010. The final steps of integrin activation. Nature Revs. Mol. Cell Biol. 11:288–300. Sonnenberg, A. & Watt, F. M., eds. 2009. Special issue on integrins. J. Cell. Sci. 122:#2. Steed, E., et al., 2010. Dynamics and functions of tight junctions. Trends Cell Biol. 20:142–149. Valastyan, S. & Weinberg, R.A. 2011. Tumor metastasis: molecular insights and evolving paradigms. Cell 147:275–292. Zaidel-Bar, R. & Geiger, B. 2010. The switchable integrin adhesome. J. Cell Sci. 123:1385–1388.

CHAPTER 8 TEXT READINGS Reviews on endomembranes and organelles can be found each year in Curr. Opin. Cell Biol. issue #4 Boettner, D.R., et al., 2012. Focus on membrane dynamics. Nature Cell Biol. 14:#1. Chacinska, A., et al., 2009. Importing mitochondrial proteins: machineries and mechanisms. Cell 138:628–644. Dalal, K. 2011. The SecY complex: conducting the orchestra of protein translocation. Trends Cell Biol. 21:506–513. Ferguson, S.M. & De Camilli, P. 2012. Dynamin, a membrane-remodelling GTPase. Nature Revs. Mol. Cell Biol. 13:75–88. Frost, A., et al., 2009. The BAR domain superfamily: membrane-molding macromolecules. Cell 137:191–196. Glick, B.S. & Nakano, A. 2009. Membrane traffic within the Golgi apparatus. Ann. Rev. Cell Dev. Biol. 25:113–132. Hsu, V.W., et al., 2012. Getting active: protein sorting in endocytic recycling. Nature Revs. Mol. Cell Biol. 13:323–328. Hurley, J.H., et al., 2010. Membrane budding. Cell 143:875–887. Jensen, D. & Schekman, R. 2011. COPII-mediated vesicle formation at a glance. J. Cell Sci. 124:1–4.

Lev, S. 2010. Non-vesicular lipid transport by lipid-transfer proteins and beyond. Nature Revs. Mol. Cell Biol. 11:739–750. Libby, P., et al., 2011. Progress and challenges in translating the biology of atherosclerosis. Nature 473:317–325. Pfeffer, S. & Novick, P., eds. 2010. Reviews on autophagy. Curr. Opin. Cell Biol. 22:#4. Prinz, W.A. 2010. Lipid trafficking sans vesicles: where, why, how? Cell 143:870–874. Raiborg, C. & Stenmark, H. 2009. The ESCRT machinery in endosomal sorting of ubiquitylated membrane proteins. Nature 458:445–452. Schmidt, O., et al., 2010. Mitochondrial protein import: from proteomics to functional mechanisms. Nature Revs. Mol. Cell Biol. 11:655–667. Shao, S. & Hegde, R.S. 2011. Membrane protein insertion at the endoplasmic reticulum. Ann. Rev. Cell Dev. Biol. 27:25–56. Stenmark, H. 2009. Rab GTPases as coordinators of vesicle traffic. Nature Revs. Mol. Cell Biol. 10:513–526. Traub, L.M., et al., 2009. Reviews of endocytosis. Nature Revs. Mol. Cell Biol. 10:#9. Walter, P. & Ron, D. 2011. The unfolded protein response: from stress pathway to homeostatic regulation. Science 334:1081–1086. Yang, Z. 2010. Focus on autophagy. Nature Cell Biol. 12:#9.

CHAPTER 9 TEXT READINGS Reviews on the cytoskeleton and motor proteins can be found each year in Curr. Opin. Cell Biol. issue #1 Akhmanova, A. & Steinmetz, M.O. 2010. Microtubule +TIPs at a glance. J. Cell Sci. 123:3415–3419. Gardel, M.L., et al., 2010. Mechanical integration of actin and adhesion dynamics in cell migration. Ann. Rev. Cell Dev. Biol. 26:315–333. Goetz, S.C. & Anderson, K.V. 2010. The primary cilium: a signalling centre during vertebrate development. Nature Revs. Gen. 11:331–344. Hammer III, J.A. & Sellers, J.R. 2012. Walking to work: roles for class V myosins as cargo transporters. Nature Revs. Mol. Cell Biol. 13:13–26. Hartman, M.A. & Spudich, J.A. 2012. The myosin superfamily at a glance. J. Cell Sci. 125:1627–1632. Hirokawa, N., et al., 2009. Kinesin superfamily motor proteins and intracellular transport. Nature Revs. Mol. Cell Biol. 10:682–696. Kollman, J.M., et al., 2011. Microtubule nucleation by γ-tubulin complexes. Nature Revs. Mol. Cell Biol. 12:709–721.

Kritikou, E., et al., 2008. Milestone papers on the cytoskeleton. Nature Suppl. December. Lindemann, C.B. & Lesich, K.A. 2010. Flagellar and ciliary beating: the proven and the possible. J. Cell Sci. 123:519–528. Mostowy, S. & Cossart, P. 2012. Septins: the fourth component of the cytoskeleton. Nat. Revs. Mol. Cell Biol. 13:183–194. Pollard, T. D. 2008. Regulation of actin filament assembly by Arp2/3 complex and formins. Ann. Rev. Biophys. Biomol. Struct. 36:451–477. Ridley, A.J. 2011. Life at the leading edge. Cell 145:1012–1022. [cell migration] Saxton, W. & Hollenbeck, P.J. 2012. The axonal transport of mitochondria. J. Cell Sci. 125:2095–2104. Sweeney, H.L. & Houdusse, A. 2010. Structural and functional insights into the myosin motor mechanism. Ann. Rev. Biophys. 39:539–557. van den Heuvel, M. G. L. & Dekker, C. 2007. Motor proteins at work for nanotechnology. Science 317:333–336. Walter, W.J. & Diez, S. 2012. A staggering giant. Nature 482:44–45. [dynein stepping] Windoffer, R., et al., 2011. Cytoskeleton in motion: the dynamics of keratin intermediate filaments in epithelia. J. Cell Biol. 194:669–678.

CHAPTER 10 TEXT READINGS Reviews on genomes and evolution can be found each year in Curr. Opin. Genetics Develop. #6 Barbujani, G. & Colonna, V. 2010. Human genome diversity: frequently asked questions. Trends Gen. 26:285–295. Callaway, E. 2011. Ancient DNA reveals secrets of human history. Nature 476:136–137. Cordaux, R. & Batzer, M. A. 2009. The impact of retrotransposons on human genome evolution. Nat. Revs. Gen. 10:691–703. Daly, A. K. 2010. Genome-wide association studies in pharmacogenomics. Nat. Revs. Gen. 11:241–246. Frazer, K. A., et al., 2009. Human genetic variation and its contribution to complex traits. Nat. Revs. Gen. 10:241–251. Gibbons, A. 2010. Tracing evolution’s recent fingerprints. Science 329:740–742. Gibson, G. 2012. Rare and common variants: twenty arguments. Nature Revs. Gen. 13:135–145. [missing heritability] Kiezun, A., et al., 2012. Exome sequencing and the genetic basis of complex traits. Nat. Gen. 44:623–630. La Spada, A. R. & Taylor, J. P. 2010. Repeat expansion disease: progress and puzzles in disease pathogenesis. Nat. Revs. Gen. 11:247–258.

Lander, E. S. 2011. Initial impact of the sequencing of the human genome. Nature 470:187–197. Lupski, J. R., et al., 2011. Three perspectives on genomics and human disease. Cell 147:32–69. Malhotra, D. & Sebat, J. 2012. CNVs: harbingers of a rare variant revolution in psychiatric genetics. Cell 148:1223–1241. Manolio, T. A., et al., 2009. Finding the missing heritability of complex diseases. Nature 461:747–753. Manolio, T. A. & Collins, F. S. 2009. The HapMap and genome-wide association studies in diagnosis and therapy. Ann. Rev. Med. 60:443–456. McClellan, J. & King, M. C. 2010. Genetic heterogeneity in human disease. Cell 141:210–217. Monroe, D. 2009. Genomic clues to DNA treasure sometimes lead nowhere. Science 325:142–143. Orr, H. T. 2009. Unstable nucleotide repeat minireview series. J. Biol. Chem. 284:7405–7435. Pennisi, E. 2009. Tales of a prehistoric human genome. Science 323:866–871. [Neanderthal genome] Pennisi, E. 2011. Green genomes. Science 332:1372–1375. [plant genomes] Vos, S. M., et al., 2011. All tangled up: how cells direct, manage and exploit topoisomerase function. Nat. Revs. Mol. Cell Biol. 12:827–841. Yandell, M. & Ence, D. 2012. A beginner’s guide to eukaryotic genome annotation. Nature Revs. Gen. 13:329–342.

CHAPTER 11 TEXT READING Reviews on the nucleus and gene expression can be found each year in Curr. Opin. Cell Biol. issue #3 Cheung, A. C. M. & Cramer, P. 2012. A movie of RNA polymerase II transcription. Cell 149:1431–1437. Cramer, P. & Arnold, E., eds. 2009. How RNA polymerases work. Curr. Opin. Struct. Biol. 19:680–782. Czech, B. & Hannon, G. J. 2011. Small RNA sorting: matchmaking for Argonautes. Nat. Revs. Gen. 12:19–31. Dunkle, J. A. & Cate, J. H. D. 2010. Ribosome structure and dynamics during translocation and termination. Ann. Rev. Biophys. 39:227–244. Hoskins, A. A. & Moore, M. J. 2012. The spliceosome: a flexible, reversible macromolecular machine. Trends Biochem. Sci. 37:179–188. Jackson, R. J., et al., 2010. The mechanism of eukaryotic translation initiation and principles of its regulation. Nat. Revs. Mol. Cell Biol. 11:113–127.

Jacquier, A. 2009. The complex eukaryotic transcriptome: unexpected pervasive transcription and novel small RNAs. Nat. Revs. Gen. 10:833–844. Kawamata, T. & Tomari, Y. 2010. Making RISC. Trends Biochem. Sci. 35:368–376. Klinge, S., et al., 2012. Atomic structures of the eukaryotic ribosome. Trends Biochem. Sci. 37:189–198. Krol, J., et al., 2010. The widespread regulation of microRNA biogenesis, function and decay. Nature Revs. Gen. 11:597–610. Kuehner, J. N., et al., 2011. Unraveling the means to an end: RNA polymerase II transcription termination. Nat. Revs. Mol. Cell Biol. 12:283–294. Nudler, E. 2009. RNA polymerase active center: the molecular engine of transcription. Ann. Rev. Biochem. 78:335–361. Proudfoot, N. J. 2011. Ending the message: poly(A) signals then and now. Genes Develop. 25:1770–1782. Schmeing, T. M. & Ramakrishnan, V. 2009. What recent ribosome structures have revealed about the mechanism of translation. Nature 461:1234–1242. Sharp, P. A., et al., 2009. Special review issue on RNA. Cell 136:#4. Siomi, M. C., et al. 2011. PIWI-interacting small RNAs: the vanguard of genome defence. Nat. Revs. Mol. Cell Biol. 12:246–258. van Hoof, A. & Wagner, E. J. 2011. A brief survey of mRNA surveillance. Trends Biochem. Sci. 36:585–592. Wilson, T. J. & Lilley, D. M. J. 2009. The evolution of ribozyme chemistry. Science 323:1436–1438. Winter, J., et al., 2009. Many roads to maturity: microRNA biogenesis pathways and their regulation. Nature Cell Biol. 11:228–234.

CHAPTER 12 TEXT READING Reviews on the nucleus and gene regulation can be found each year in Curr. Opin. Cell Biol. #3. Reviews on chromosomes and gene expression can be found each year in Curr. Opin. Genetics and Develop. #2 Bose, T. & Gerton, J. L. 2010. Cohesinopathies, gene expression, and chromatin organization. J. Cell Biol. 189:201–210. Bowman, G. D. 2010. Mechanisms of ATP-dependent nucleosome sliding. Curr. Opin. Struct. Biol. 20:73–81. Bulger, M. & Groudine, M. 2011. Functional and mechanistic diversity of distal transcription enhancers. Cell 144:327–339. Fabian, M. R., et al., 2010. Regulation of mRNA translation and stability by microRNAs. Ann. Rev. Biochem. 79:351–379.

Farnham, P. J. 2009. Insights from genomic profiling of transcription factors. Nature Revs. Gen. 10:605–616. Ferguson-Smith, A. C. 2011. Genomic imprinting: the emergence of an epigenetic paradigm. Nature Revs. Gen. 12:565–575. Fuda, N. J., et al., 2009. Nature insight: Transcribing the genome. Nature 461:185–223. Greer, E. L. & Shi, Y. 2012. Histone methylation: a dynamic mark in health, disease, and inheritance. Nature Revs. Gen. 13:343–357. Grünwald, D., et al., 2011. Nuclear export dynamics of RNA-protein complexes. Nature 475:333–341. Jiang, C. & Pugh, B. F. 2009. Nucleosome positioning and gene regulation. Nature Revs. Gen. 10:161–172. Jones, P. A. 2012. Functions of DNA methylation. Nature Revs. Gen. 13:484–492. Kugel, J. F. & Goodrich, J. A. 2012. Noncoding RNAs: key regulators of mammalian transcription. Trends Biochem. Sci. 37:144–151. Li, J. & Gilmour, D. S. 2011. Promoter proximal pausing and the control of gene expression. Curr. Opin. Gen. Develop. 21:231–235. Luo, Z., et al., 2012. The super elongation complex (SEC) family in transcriptional control. Nature Revs. Mol. Cell Biol. 13:543–548. Malik, H. S. & Henikoff, S. 2009. Major evolutionary transitions in centromere complexity. Cell 138:1067–1082. Mattick, J. S., et al., 2009. RNA regulation of epigenetic processes. Bioess. 31:51–59. Moazed, D. 2011. Mechanisms for the inheritance of chromatin states. Cell 146:510–518. Nilsen, T. W. & Graveley, B. R. 2010. Expansion of the eukaryotic proteome by alternative splicing. Nature 463:457–463. Ong, C.-T. & Corces, V. G. 2011. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nature Revs. Gen. 12:283–293. Osterhage, J. L. & Friedman, K. L. 2009. Chromosome end maintenance by telomerase. J. Biol. Chem. 284:16061–16065. Pauli, A., et al., 2011. Non-coding RNAs as regulators of embryogenesis. Nature Revs. Gen. 12:136–149. Riddihough, G., et al., 2010. What is epigenetics? Science 330:611–632. 
Schoenfelder, S., et al., 2010. The transcriptional interactome: gene expression in 3D. Curr. Opin. Gen. Develop. 20:127–133. Spitz, F. & Furlong, E. E. M. 2012. Transcription factors: from enhancer binding to developmental control. Nature Revs. Gen. 13:613–626.

Strambio-De-Castillia, C., et al., 2010. The nuclear pore complex: bridging nuclear transport and gene regulation. Nat. Revs. Mol. Cell Biol. 11:490–501. Suganuma, T. & Workman, J. L. 2011. Signals and combinatorial functions of histone modifications. Ann. Rev. Biochem. 80:473–499. Taatjes, D. J. 2010. The human Mediator complex: a versatile, genome-wide regulator of transcription. Trends Biochem. Sci. 35:315–322. Vogel, C. & Marcotte, E. M. 2012. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nature Revs. Gen. 13:227–232. Weake, V. M. & Workman, J. L. 2010. Inducible gene expression: diverse regulatory mechanisms. Nature Revs. Gen. 11:426–437. Wilson, K. L. & Berk, J. M. 2010. The nuclear envelope at a glance. J. Cell Sci. 123:1973–1978. Yamanaka, S. & Blau, H. M. 2010. Nuclear reprogramming to a pluripotent state by three approaches. Nature 465:704–712. Zhou, V. W., et al., 2011. Charting histone modifications and the functional organization of mammalian genomes. Nature Revs. Gen. 12:7–18.

CHAPTER 13 TEXT READING Alabert, C. & Groth, A. 2012. Chromatin replication and epigenome maintenance. Nat. Revs. Mol. Cell Biol. 13:153–167. Balakrishnan, L. & Bambara, R. A. 2011. Eukaryotic lagging strand DNA replication employs a multi-pathway mechanism that protects genome integrity. J. Biol. Chem. 286:6865–6870. Bernstein, K. A. & Rothstein, R. 2009. At loose ends: resecting a double-strand break. Cell 137:807–810. Broyde, S. & Patel, D. J. 2010. How to accurately bypass damage. Nature 465:1023–1024. Cleaver, J. E., et al., 2009. Disorders of nucleotide excision repair. Nat. Revs. Gen. 10:756–768. Corpet, A. & Almouzni, G. 2009. Making copies of chromatin: the challenge of nucleosomal organization and epigenetic information. Trends Cell Biol. 19:29–40. Gilbert, D.M. 2010. Evaluating genome-scale approaches to eukaryotic DNA replication. Nat. Revs. Gen. 11:673–684. Hamdan, S. M. & van Oijen, A. M. 2010. Timing, coordination, and rhythm: acrobatics at the DNA replication fork. J. Biol. Chem. 285:18979–18983. Lukas, J., et al., 2011. More than just a focus: the chromatin response to DNA damage and its role in genome integrity maintenance. Nat. Cell Biol. 13:1161–1169.

Masai, H., et al., 2010. Eukaryotic chromosome DNA replication. Where, when, and how? Ann. Rev. Biochem. 79:89–130. Méchali, M. 2010. Eukaryotic DNA replication origins: many choices for appropriate answers. Nat. Revs. Mol. Cell Biol. 11:728–738. Misteli, T. & Soutoglou, E. 2009. The emerging role of nuclear architecture in DNA repair and genome maintenance. Nat. Revs. Mol. Cell Biol. 10:243–254. Ransom, M., et al., 2010. Chaperoning histones during DNA replication and repair. Cell 140:183–195. Yao, N. Y. & O’Donnell, M. 2009. Replisome structure and conformational dynamics underlie fork progression past obstacles. Curr. Opin. Cell Biol. 21:336–343.

O’Connell, C. B., et al., 2012. Reviews on the kinetochore. Curr. Opin. Cell Biol. 24:40–70. Pollard, T. D. 2010. Mechanics of cytokinesis in eukaryotes. Curr. Opin. Cell Biol. 22:50–56. Przewloka, M. R. & Glover, D. M. 2009. The kinetochore and the centromere: a working long distance relationship. Ann. Rev. Gen. 43:439–465. Reinhardt, H. C. & Yaffe, M. B. 2009. Kinases that control the cell cycle in response to DNA damage: Chk1, Chk2, and MK2. Curr. Opin. Cell Biol. 21:245–255. Skibbens, R. V. 2010. Buck the establishment: reinventing sister chromatid cohesion. Trends Cell Biol. 20:507–513. Walczak, C. E., et al., 2010. Mechanisms of chromosome behaviour during mitosis. Nat. Revs. Mol. Cell Biol. 11:91–102.

CHAPTER 14 TEXT READINGS Reviews on cell division can be found each year in Curr. Opin. Cell Biol. #6. Alushin, G. & Nogales, E. 2011. Visualizing kinetochore architecture. Curr. Opin. Struct. Biol. 21:661–669. Bloom, K. & Joglekar, A. 2010. Towards building a chromosome segregation machine. Nature 463:446–456. Coller, H. A. 2011. The essence of quiescence. Science 334:1074–1075. Compton, D. A. 2011. Mechanisms of aneuploidy. Curr. Opin. Cell Biol. 23:109–113. Fededa, J. P. & Gerlich, D. W. 2012. Molecular control of animal cell cytokinesis. Nat. Cell Biol. 14:440–447. Handel, M. A. & Schimenti, J. C. 2010. Genetics of mammalian meiosis: regulation, dynamics and impact on fertility. Nat. Revs. Gen. 11:124–136. Joglekar, A. P., et al., 2010. Mechanisms of force generation by end-on kinetochore-microtubule attachments. Curr. Opin. Cell Biol. 22:57–67. Lampert, F. & Westermann, S. 2011. A blueprint for kinetochores—new insights into the molecular mechanics of cell division. Nat. Revs. Mol. Cell Biol. 12:407–412. Ledbetter, D. H. 2009. Chaos in the embryo. Nat. Med. 15:490–491. [aneuploidy] Malumbres, M. & Barbacid, M. 2009. Cell cycle, CDKs and cancer: a changing paradigm. Nat. Revs. Cancer 9:153–166. Maresca, T. J. & Salmon, E. D. 2010. Welcome to a new kind of tension: translating kinetochore mechanics into a wait-anaphase signal. J. Cell Sci. 123:825–835. Nagaoka, S. I., et al., 2012. Human aneuploidy: mechanisms and new insights into an age-old problem. Nat. Revs. Gen. 13:493–504. Neto, H. & Gould, G. W. 2011. The regulation of abscission by multi-protein complexes. J. Cell Sci. 124:3199–3207.

CHAPTER 15 TEXT READINGS Reviews on cell regulation can be found each year in Curr. Opin. Cell Biol. #2. Bollen, M., et al., 2010. The extended PP1 toolkit: designed to create specificity. Trends Biochem. Sci. 35:450–457. Chaudhari, N. & Roper, S. D. 2010. The cell biology of taste. J. Cell Biol. 190:285–296. DeMaria, S. & Ngai, J. 2010. The cell biology of smell. J. Cell Biol. 191:443–452. Foster, K. G. & Fingar, D. C. 2010. mTOR: conducting the cellular signaling symphony. J. Biol. Chem. 285:14071–14077. Good, M. C., et al., 2011. Scaffold proteins: hubs for controlling the flow of cellular information. Science 332:680–686. Happo, L., et al., 2012. BH3-only proteins in apoptosis at a glance. J. Cell Sci. 125:1081–1087. Hunter, T. 2009. Tyrosine phosphorylation: thirty years and counting. Curr. Opin. Cell Biol. 21:140–146. Kenyon, C. 2010. The genetics of ageing. Nature 464:504–512. Kim, T.-H., et al., 2010. Guard cell signal transduction network. Ann. Rev. Plant Biol. 61:561–591. Kobilka, B. K. 2011. Structural insights into adrenergic receptor function and pharmacology. Trends Pharmacol. Sci. 32:213–218. Laplante, M. & Sabatini, D. M. 2012. mTOR signaling in growth control and disease. Cell 149:274–294. Lemmon, M. A. & Schlessinger, J. 2010. Cell signaling by receptor tyrosine kinases. Cell 141:1117–1134. Maxmen, A. 2012. Calorie restriction falters in the long run. Nature 488:569. Mendoza, M. C., et al., 2011. The Ras-ERK and PI3K-mTOR pathways: cross-talk and compensation. Trends Biochem. Sci. 36:320–328.

Miaczynska, M. & Bar-Sagi, D. 2010. Signaling endosomes: seeing is believing. Curr. Opin. Cell Biol. 22:535–540. Parekh, A. B. 2010. Store-operated CRAC channels: function in health and disease. Nature Revs. Drug Disc. 9:399–410. Pearce, L. R., et al., 2010. The nuts and bolts of AGC protein kinases. Nat. Revs. Mol. Cell Biol. 11:9–22. Rosenbaum, D. M., et al., 2009. The structure and function of G-protein-coupled receptors. Nature 459:356–363. Santner, A. & Estelle, M. 2009. Recent advances and emerging trends in plant hormone signalling. Nature 459:1071–1078. Sierra, F., et al., 2009. Prospects for life extension. Ann. Rev. Med. 60:457–469. Tait, S. W. G. & Green, D. R. 2010. Mitochondria and cell death: outer membrane permeabilization and beyond. Nat. Revs. Mol. Cell Biol. 11:621–632. Taubes, G. 2009. Prosperity’s plague. Science 325:256–260. [insulin resistance] Yarmolinsky, D. A., et al., 2009. Common sense about taste: from mammals to insects. Cell 139:234–244. Zoncu, R., et al., 2011. mTOR: from growth signal integration to cancer, diabetes and ageing. Nat. Revs. Mol. Cell Biol. 12:21–35.

CHAPTER 16 TEXT READINGS Reviews on cancer can be found each year in Curr. Opin. Gen. Develop. #1. Bernards, R. 2010. It’s diagnostics, stupid. Cell 141:13–17. Brody, H., et al., 2011. Nature outlook on cancer prevention. Nature 471:3/24, S1–S24. Cairns, R. A., et al., 2011. Regulation of cancer cell metabolism. Nat. Revs. Cancer 11:85–95. Collado, M. & Serrano, M. 2010. Senescence in tumours: evidence from mice and humans. Nat. Revs. Cancer 10:51–61. Copeland, N. G. & Jenkins, N. A. 2009. Deciphering the genetic landscape of cancer—from genes to pathways. Trends Gen. 25:455–464. Cowin, P. A., et al., 2010. Profiling the cancer genome. Ann. Rev. Genom. Human Gen. 11:133–159. Dancey, J. E., et al., 2012. The genetic basis for cancer treatment decisions. Cell 148:409–420. Druker, B. J., et al., 2009. 3 reviews on the discovery and development of Gleevec. Nature Med. 15:1149–1161. Farrell, A., et al., 2011. Focus on cancer. Nature Med. 17:#3. Fletcher, O. & Houlston, R. S. 2010. Architecture of inherited susceptibility to common cancer. Nat. Revs. Cancer 10:353–361.

Gilbertson, R. J. & Graham, T. A. 2012. Cancer: resolving the stem-cell debate. Nature 488:462–463. Grivennikov, S. I., et al., 2010. Immunity, inflammation, and cancer. Cell 140:883–899. Haber, D. A., et al., 2011. The evolving war on cancer. Cell 145:19–24. Hanahan, D. & Weinberg, R. A. 2011. Hallmarks of cancer: The next generation. Cell 144:646–674. Holland, A. J. & Cleveland, D. W. 2009. Boveri revisited: chromosomal instability, aneuploidy and tumorigenesis. Nat. Revs. Mol. Cell Biol. 10:478–487. Huen, M. S. Y., et al., 2010. BRCA1 and its toolbox for the maintenance of genome integrity. Nat. Revs. Mol. Cell Biol. 11:138–148. Kaelin, W. G., Jr. & Thompson, C. B. 2010. Cancer: Clues from cell metabolism. Nature 465:562–564. Kruse, J.-P. & Gu, W. 2009. Modes of p53 regulation. Cell 137:609–622. Kuilman, T. 2010. The essence of senescence. Genes Develop. 24:2463–2479. Lesterhuis, W. J., et al., 2011. Cancer immunotherapy—revisited. Nat. Revs. Drug Discov. 10:591–600. Lujambio, A. & Lowe, S. W. 2012. The microcosmos of cancer. Nature 482:347–355. [miRNA] Marshall, E., et al., 2011. News and reviews on cancer. Science 331:1540–1570. Negrini, S., et al., 2010. Genomic instability—an evolving hallmark of cancer. Nat. Revs. Mol. Cell Biol. 11:220–228. Schilsky, R. L. 2010. Personalized medicine in oncology: the future is now. Nat. Revs. Drug Discov. 9:363–366.

Sellers, W. R. 2011. A blueprint for advancing genetics-based cancer therapy. Cell 147:26–31. Shackleton, M., et al., 2009. Heterogeneity in cancer: cancer stem cells versus clonal evolution. Cell 138:822–829. Venkitaraman, A., et al., 2012. Cancer genomics. Curr. Opin. Cell Biol. #1. Visvader, J. E. 2011. Cells of origin in cancer. Nature 469:314–322. William, W. N., Jr., et al., 2009. Molecular targets for cancer chemoprevention. Nat. Revs. Drug Discov. 8:213–224.

CHAPTER 17 TEXT READINGS The following consist entirely of reviews in Immunology: Advances in Immunology Annual Review of Immunology Critical Reviews in Immunology Current Opinion in Immunology Immunological Reviews Nature Reviews Immunology Trends in Immunology Barreiro, L. B. & Quintana-Murci, L. 2010. From evolutionary genetics to human immunology: how selection shapes host defence genes. Nature Revs. Gen. 11:17–30. Blumberg, R. S., et al., 2012. Focus on autoimmunity. Nature Med. 18:35–70. Chapman, S. J. & Hill, A. V. S. 2012. Human genetic susceptibility to infectious disease. Nature Revs. Gen. 13:175–188. Dolgin, E. 2010. The inverse of immunity. Nature Med. 16:740–743. [autoimmune disease] Erdmann, J. 2009. Lights, camera, infection. Nature 460:568–570.

Flajnik, M. F. & Kasahara, M. 2010. Origin and evolution of the adaptive immune system: genetic events and selective pressures. Nature Revs. Gen. 11:47–59. Geissmann, F., et al., 2010. Development of monocytes, macrophages, and dendritic cells. Science 327:656–662. Herzog, S., et al., 2009. Regulation of B-cell proliferation and differentiation by pre-B-cell receptor signalling. Nat. Revs. Immunol. 9:195–206. Huse, M. 2009. The T-cell-receptor signaling network. J. Cell Sci. 122:1269–1273. Kawai, T. & Akira, S. 2010. The role of pattern-recognition receptors in innate immunity. Nature Immunol. 11:373–384. Kyewski, B. & Peterson, P. 2010. Aire, master of many trades. Cell 140:24–26. Leslie, M. 2009. Internal affairs. Science 326:929–931. [innate immunity] Mathis, D., et al., 2010. Reviews on autoimmunity. Nature Immunol. 11:3–46. Medzhitov, R., et al., 2010. Reviews on inflammation. Cell 140:#6. Mueller, K. L., et al., 2010. Issue on innate immunity. Science Jan. 15. O’Shea, J. J. & Paul, W. E. 2010. Mechanisms underlying lineage commitment and plasticity of helper CD4+ T cells. Science 327:1098–1102. Paul, W. E. 2011. Bridging innate and adaptive immunity. Cell 147:1212–1215. Vivier, E., et al., 2011. Innate or adaptive immunity? The example of natural killer cells. Science 331:44–49. Zenewicz, L. A., et al., 2010. Unraveling the genetics of autoimmunity. Cell 140:791–797.

Index A aa-tRNAs. See Aminoacyl-tRNAs A band, 366, 366f ABC. See ATP-binding cassette Aberrations, of resolving power, 734 ABL tyrosine kinase, cancer therapy targeting, 75f, 690 ABO blood group, 130, 416 Abscission, 598–599, 598f Absolute temperature (K), 89, 91 Absorption spectrum, 217 of chlorophyll, 216–217 of photosynthetic pigments, 217f Acetic acid, 39, 39t Acetylcholine (ACh), 168. See also Nicotinic acetylcholine receptor excitatory and inhibitory effects of, 170 Acetylcholine receptors, 169f, 171EP–175EP structure of, 173EP–174EP, 174EPf Acetylcholinesterase, 60, 60f, 170 Acetylcholinesterase inhibitors, 104 Acetyl coenzyme A (Acetyl CoA), 183f in TCA cycle, 185–186, 185f, 186f N-Acetylgalactosamine, 129f N-Acetylglucosamine, 47, 129f, 287f, 293f Acetyl group, 183, 185 Acetylation, of histones, 500–501, 527–528 Acids, 39–40, 39t Acid-base pairs, 39 Acid hydrolases, 303–304 Acidophiles, 14 Acinar cells, 273, 274f reprogramming of, 23HP Acquired immune deficiency syndrome (AIDS), 710 resistance to drugs, 108HP RNA interference therapy, 458HP Actin, 358 Actin-ATP monomers (G-actin), 372 Actin-binding proteins, 372–374, 373f functions, 372fn, 373f lamellipodia and, 376–377 Actin-bundling proteins, 373, 373f Actin filaments, 1f, 324f, 325f, 358f adherens junctions and, 257, 258f assembly (polymerization), 358–359, 359f cell motility and, 360, 372f, 374, 375f, 378, 378f in cytokinesis, 599–601, 600f disassembly (depolymerization), 359–360, 378, 378f effects of actin-binding proteins, 372–374, 373f as force-generating mechanism, 374, 375f, 377–378 interactions with myosins, 360, 362–363, 362f, 363f, 369–371, 369f, 371f, 375f, 378, 380f decoration, 375f energetics, 369–370 lever-arm hypothesis, 369, 369f lamellipodia and, 377, 378f molecular motor for, 360–364 in neural tube formation, 382f nucleation, 372, 377f of sarcomeres, 368, 368f organization within cell, 372, 372f properties, 325t in sarcomere, 367f, 369f

size of, 19f in stereocilia of inner ear, 364, 365f “treadmilling,” 359f, 360 Actin filament-depolymerizing proteins, 373, 373f Actin filament-severing proteins, 373–374, 373f Actin monomer-sequestering proteins, 372 Actin-spectrin network of plasma membrane skeleton, 146f, 147 Action potentials, 165–167, 166f propagation as nerve impulse, 167–168, 167f Action spectrum, 21f, 217 Activation energy, 96–97, 96f Activation-induced cytosine deaminase (AID), 715 Activation loop, 638 Active immunotherapy, for cancer, 688–689 Active transport, 148f, 157–161, 158f, 159f coupling to ATP hydrolysis, 157–159, 158f coupling to ion gradients, 160–161, 161f in plants, 161 primary compared with secondary, 160–161 secondary, 161, 161f Acute lymphoblastic leukemia (ALL), gene expression profiles of, 685–686, 686f Acute myeloid leukemia (AML) gene expression profiles of, 685–686, 686f genes associated with, 680–681 Adaptation, of proteins, 76–77, 76f, 77f Adaptive immune responses, 701f, 703–704, 703f Adaptor proteins, RTK interaction with, 638–640, 639f Adaptors on clathrin-coated vesicles, 299–300, 300f, 310, 310f, 320EP Adcetris, 688 Adenine (A), 78, 393f, 394 base pairing, 395f–396f structure, 78f Adenomas, 626HP, 684f Adenosine diphosphate. See ADP Adenosine monophosphate. See AMP Adenosine triphosphate. See ATP Adenoviruses, 24, 24f cancer caused by, 668 Adenylyl cyclase, 632–633, 632f Adherens junctions, 257, 258f Adhesion between cells, 236, 236f, 250–251, 251f, 252f, 259f. See also Cell-adhesion molecules role of cadherins, 253–254, 253f between cells and environment, 244–250, 248f, 260f between cells and substratum, 235f, 236f, 242f, 243, 247–250, 248f, 249f in inflammation and cancer, 255HP–256HP Adhesive junctions, 257, 258f ADP as metabolic regulator, 117 energy source, 199, 201 free energy required, 110 phosphorylation, 111f, 112f, 113, 184f, 186. See also ATP formation uncoupling from oxidation, 198 within ATP synthase, 201

Adrenoleukodystrophy (ALD), 208HP–209HP Adult stem cells, 20HP, 20HPf, 669, 670f Aequorin, 737 Aerobic metabolism, 188HP Aerobic respiration, 178–210, 205f, 515–517 anaerobic ATP formation compared with, 188HP, 667 photosynthesis compared with, 215, 215f Affinity chromatography, 172EP–173EP, 754, 754f African populations, genomes of, and origin of our species, 402, 419HP–420HP Aging cell signaling role in, 647HP–648HP, 647HPf Down syndrome and, 609HP free radicals and, 35HP meiotic nondisjunction, 609HP mitochondrial DNA mutations and, 208HP, 208HPf premature, 208HP, 489f, 490 telomeres and, 508 AIDS. See Acquired immune deficiency syndrome AIRE gene, 721 AKAPs. See PKA-anchoring proteins AKT. See Protein kinase B Alanine (Ala, A), 52f, 53 Albinism, 364 Alcaptonuria, 427 Aldara, 702 Aldoses, 43, 43f Aldotetroses, 44, 44f ALK kinase, cancer therapy targeting, 691 Alleles, 387–388 in genetic recombination, 390–391, 391f incomplete linkage, 390–391 All-or-none law of nerve function, 167 Allosteric modulation of enzymes, 115–117, 116f Allosteric site, 115, 116f Alpha (α) helix depiction of, 57, 57f folding of, 64, 64f, 65f in integral membrane proteins, 124f, 134, 134f, 134fn, 135f in membrane fusion, 303f in myoglobin, 58, 58f of polypeptide chains, 55, 56f Alpha particles, 749 Alport syndrome, 240 ALS. See Amyotrophic lateral sclerosis Altered peptide ligands (APLs), 726HP Alternative splicing, 454, 534–535, 535f differences between organisms and, 412–413 regulation of, 534–535 Alu repeated DNA sequences, 410–411 Alzheimer’s disease (AD), 66HP–70HP, 609 high-risk alleles, 417HP–418HP, 417HPfn mechanism of, 66HP–68HP prevention and treatment of, 68HP–70HP synaptic plasticity and, 170 Alzhemed, 69HP Amide bonds, 41 Amino acids

I-2 INDEX activation by aminoacyl-tRNA synthetases, 466–467 in cell signaling, 621 codons for, 462 evolution, 454 genetic code assignments, 463–464, 464f hydrophobic. See Hydrophobic amino acids in IFs, 354 in integral membrane proteins, 135f determining spatial relationships, 136–137, 137f hydrophobicity, 134–135, 135f mating with tRNAs, 465f, 466–468 in proteins, 50 folding and, 65, 65f sequences of evolutionary relationships and, 27EP–28EP, 76 membrane protein orientation and, 286f role of nucleotide sequences, 396, 428 side chains of, 51–54, 51f, 52f, 53f nonpolar, 52f, 53 polar, charged, 51–53, 52f, 53f polar, uncharged, 52f, 53 posttranslational modifications of, 54 with unique properties, 52f, 53–54 stereoisomerism of, 50, 51f structure of, 50–51, 51f substitutions, evolution and, 76, 77f TCA cycle and, 186, 186f water interaction with, 38 Amino acid sequences, nucleotide changes and, 463 Aminoacyl-tRNAs (aa-tRNAs), 464–465, 471, 472f, 473 in initiation of protein synthesis, 468–469 Aminoacyl-tRNA synthetases, 466–467, 467f RNA evolution and, 455 p-Aminobenzoic acid (PABA), 106HP Aminoglycosides, 107HPt Amino groups, 41t of amino acids, 50–51, 51f in alpha helix, 56f in peptide bonds, 51, 51f combining with proton, 39 AMP, as metabolic regulator, 117 AMP-activated protein kinase (AMPK), 117 Amphioxus, and evolution research, 407 Amphipathic lipids, 125 Amphipathic molecules, 47 Amphipathic proteins, 130–132 Amphoteric molecules, 39 AMPK. 
See AMP-activated protein kinase Amylase, 47 Amyloid, 66HP–67HP, 66HPfn, 67HPf Amyloid β-peptide (Aβ), 67HP–68HP, 68HPf antibodies for, 69HP Amyloid hypothesis, 67HP–68HP Amyloid plaques, 68HP, 69HP, 69HPf Amyloid precursor protein (APP), 67HP–68HP, 68HPf, 609 as drug target, 69HP Amylopectin, 46–47 Amylose, 46–47 Amyotrophic lateral sclerosis (ALS), 356 Anabolic pathways, 108, 109f, 115, 116f separation from catabolic pathways, 116–117 Anaerobes, 178 Anaerobic ATP formation, 188HP, 667 Anaerobic glycolysis, 113, 667 Anaerobic metabolism, 188HP Anaerobic oxidation, 113–114, 114f Anaphase of meiosis, 607, 607f

of mitosis, 592–597 chromosome movements at, 594–596, 595f events of, 592–594, 594f proteolysis in, 592, 593f spindle assembly checkpoint, 596–597, 596f Anaphase A, 594, 594f, 597 Anaphase B, 594, 594f, 597 Anaphase promoting complex (APC), 592 Aneuploidy, 608HP–609HP of cancer cells, 666–667, 667f Angiogenesis, in cancer, inhibition of, 692–693, 693f Angstrom (Å), 17 Animals, transgenic, 776–778, 777f Animal cells, 8f–9f, 27EPf Animal cloning, 513, 513f Animal fats, 48 Animal models, 68HP, 776 for acetylcholine receptor, 171EP–172EP knockout mice, 778–780, 779f, 779fn Anion, 34 Anion exchangers, 753 Ankyrins, 146f, 147 Antenna pigments, 218–220, 218f, 219f, 220f, 223, 223f Antibiotics development of, 106HP mechanism of action of, 106HP–108HP protein synthesis, 474, 478EP–479EP resistance to, 106HP–108HP, 474 bacteriophage therapy, 26 Antibodies, 50 for amyloid β-peptide, 69HP antigen interaction with, 712–713, 713f antigen selection of, 704–705, 704f, 705f cancer therapy using, 688, 693 against cell-adhesion molecules, 255HP, 256HP conformation of, 57 domains of, 712, 712f enzymatic activity of, 97 fluorescent, 737 fluorescent labeling, in cytoskeleton studies, 327, 327f genes encoding B- and T-cell antigen receptors, 713–716, 714f, 715f in immune response, 701f, 703 against integrins, 246–247, 247f molecular biology techniques using, 780–783, 782f molecular structure of, 710–713, 710t, 711f, 711fn, 712f, 713f in pregnancy, 713 against self, 706 TH cell role in formation of, 709–710, 709f Antibody genes, DNA rearrangements, role of transposition, 411 Anticodons, 465f, 466, 466f interactions with codons, 466, 467f, 471f, 472f in polypeptide elongation, 471–474, 472f within ribosomes, 470, 471f, 472f Antigens antibody interaction with, 712–713, 713f antibody selection by, 704–705, 704f, 705f MHC role in presentation of, 727EP–730EP, 727EPt, 728EPt, 729EPf, 730EPf self, 724HP Antigenic determinant, 712 Antigen-presenting cells (APCs), 707–708, 707f, 708f MHC
interaction with, 727EP–728EP, 727EPt, 728EPt MHC proteins and, 716–717, 717f

T cell activation by, 722–723, 722f T cell interactions with, 717, 718f, 720–721 Antigen receptors DNA rearrangements producing genes encoding, 713–716, 714f, 715f structure of, 716, 716f Anti-inflammatory drugs, cancer prevention with, 669 Anti-integrin antibodies, 246–247, 247f Anti-myosin II antibodies, 599 Antioxidants, 35HP Antiport, 161 Antisense RNA, 457, 461 Antiserum, 781 AP2 adaptors, 310–311, 312f APC. See Anaphase promoting complex APC gene in cancer development, 678 in colon cancer, 683, 683f, 684f AP endonuclease, in BER, 567 Apical plasma membrane of epithelial cells, 144, 144f Apolipoproteins, 313–314, 314f Apoptosis in cancer cells, 667–677, 667f, 678 p53 role in, 676–677, 676f, 677f, 679f cell signaling in, 656–658, 657f extrinsic pathway, 658–659, 658f intrinsic pathway, 659–660, 659f, 659fn, 660f oncogenes encoding products affecting, 681 Aquaporins, 151, 151f Arabidopsis thaliana, 18f Archaea, 14 as domain, 29EP in studies of replication, 556f, 559 locations of, 14 phylogenetic tree, 29EPf Archaebacteria, 14, 28EP eubacterial genes in, 29EP–30EP in eukaryote genome, 29EP–30EP halophilic (salt-loving), amino acid substitutions in, 76, 77f terminology, 29EPfn ARF1 coat protein, 300, 300f Arginine (Arg, R), 51–53, 52f, 53f methylation of, 500f Arginine-glycine-aspartic acid (RGD) sequence, 242f, 245t, 247 in stroke and heart attack medications, 247, 247f Arp2/3 complex, 372, 377–378, 377f, 378f Arrestins, 624, 625f Artifacts, in electron microscopy, 742 Arzerra, 688, 726HP Asbestos, cancer caused by, 668 Asparagine (Asn, N), 52f, 53 Aspartic acid (Asp, D), 51–53, 52f, 53f, 100, 204–205, 204f Aspirin, cancer prevention with, 669 Assays, of purified proteins, 752 Association analyses, 417HP–418HP Astral microtubules in metaphase, 590 in prophase, 586–587, 587f Asymmetric carbon atom, 44, 44f Asymmetric cell division, 5f, 574, 604f, 699f Asymmetry (sidedness) of body structure, 349HP, 349HPfn in electron distribution, 37 in membranes, 128–130, 134,
134f, 285, 285f of membrane leaflets, 139, 285 of membrane lipids, 139, 140f Ataxia-telangiectasia (AT), 579 Atherosclerosis, 314, 314f

Atoms, 33, 33f electronegative, 34 stabilization of, 35HP Atomic force microscopy (AFM), 201f, 263f, 329, 329f, 363f, 748, 748f ATP, 79 in actin filament assembly, 358–359 in actin-myosin sliding, 369–370, 370f as metabolic regulator, 117 motor protein movement and, 334 ATPases, 158–160, 199, 325t, 358 ATP-binding cassette (ABC), in cystic fibrosis, 162HP ATP-binding cassette (ABC) transporters, 160, 162HP ATP formation, 110–115, 111f, 112f. See also ADP, phosphorylation binding change mechanism, 201–205, 203f in chloroplasts, 225–226 in cyclic photophosphorylation, 226, 226f energy required, 201 energy sources, 189 free energy required, 110 in glycolysis, 183, 184f, 187 indirect route, 112, 112f inhibition of, 198–199 in mitochondrial membrane, 122f, 183f molecular machinery, 199–205, 199f during muscle contraction, 188HP in noncyclic photophosphorylation, 226 oxidative phosphorylation in, 187, 187f proton/ATP ratio, 205 proton movements and, 187, 201, 205 reduced coenzymes in, 186–187, 187f regulation of, 205 role of mitochondria, 189–197 synthesizing machinery, 4f, 5 TCA cycle and, 185f ATP hydrolysis, 92f cellular uses, 93, 93f in coupled reactions, 93 coupling to active transport, 157–159, 158f energetics, 91–92 ATP synthase, 199–201, 200f bacterial compared with mitochondrial, 200–201 catalytic sites, 200, 201–202, 202f binding affinity, 201–202 in chloroplasts, 224f, 225 conformational changes, 202, 203f, 204–205, 204f mitochondrial, 181f rotational catalysis in, 202–204, 203f Attenuated pathogens, in vaccines, 706–707 Attenuation, in lac operon, 486 Attractive forces of atoms, 34 AUG initiation codon, 468–469, 469f, 470 Aurora B kinase, 585, 597, 781f Autoantibodies, 706, 724HP–726HP Autocrine signaling, 618, 618f Autoimmune diseases, 706, 724HP–726HP bullous pemphigoid, 250 pemphigus vulgaris, 257 regulatory T cell role in, 710 systemic lupus erythematosus, 451 T-cell selection role in, 721 therapy for, 725HP–726HP Autolysosome, 305 Autonomous
replicating sequences (ARSs), 559 Autophagosome, 305, 305f Autophagy, 304–305, 305f, 648 Autoradiography, 273, 274f, 749, 750f Autosomes abnormal number, 608HP–609HP normal human complement, 608HP

Autotrophs, 211–212 Avastin, 693 Avian erythroblastosis virus, 680 Avian influenza virus, 25 Avogadro’s number, 33fn Avonex, 726HP Axons, 164, 164f intermediate filaments in, 356 microtubules in, 324f, 340 at neuromuscular junction, 370, 371f saltatory conduction, 168f Axonal outgrowth, 356, 379–381, 380f Axonal transport, 333–334, 333f direction, 332, 333f role of microtubules, 332, 334f Axonemal (ciliary, flagellar) dynein, 350–351, 351f Axonemes, 346, 350–351 9 + 2 array, 346, 347f in Kartagener syndrome, 349HP relation to basal body, 348f sliding of microtubules, 351–352 structure, 346, 347f, 351 B Backbones of DNA, 393f, 394, 395f–396f, 396 of nucleic acids, 77f of phosphoglycerides, 125–126, 125f, 126f Bacteria, 14 autophagy and, 305 cell structure, 8f–9f communities in humans, 15 conjugation, 12, 13f DNA of, 513 DNA replication in, 549 DNA polymerase properties, 550–552, 551f duplex unwinding and strand separation, 549–550, 550f machinery operating at replication fork, 553–554, 554f, 554fn, 555f replication forks and bidirectional replication, 549, 549f semidiscontinuous replication, 552–553, 552f, 553f as domain, 29EP flagella of, 12, 13f free radicals and, 35HP genome complexity, 400, 401f light-driven proton pump, 160, 160f locations of, 14 in lysosomes, 304 metabolism in, 13 operon, 484–487, 484f, 485f lac, 485–487, 486f, 487f trp, 485, 486f as origin of mitochondria, 29EP, 182 phagocytosis of, 270f survival of, 315 photosynthetic, 8, 212, 212f phylogenetic tree, 29EPf propulsion through cytoplasm, 375f size of, 17, 19f terminology, 29EPfn transformation in, 421EP–422EP, 421EPf transposed DNA sequences, 409, 409f virus of. See Bacteriophages Bacterial artificial chromosome (BAC), 774 Bacterial membrane, mitochondrial compared with, 182 Bacterial plasmids, eukaryotic DNA cloning in, 767–768, 767f, 768f

Bacterial toxins, GPCRs and, 627 SNAREs and, 302 Bacteriophages, 422EPf as antibiotic replacement, 26 in early DNA studies, 397f, 422EP–423EP, 423EPf infection by, 25f structure, 24, 24f Bacteriorhodopsin, 160, 160f Banding patterns, on chromosomes, 392, 392f, 503f Bapineuzumab, 69HP Bardet-Biedl syndrome (BBS), 350HP Barr body, 498, 498f Basal bodies, 340, 346–347, 348f in Bardet-Biedl syndrome, 350HP relation to axonemes, 348f Basal lamina. See Basement membrane Bases, 39–40, 39t of nucleic acids, 77–78, 78f, 393–394, 393f terminology, 394fn Base excision repair (BER), 566–567, 566f, 567f Basement membrane (BM), 144, 144f, 236f, 237–238, 237f, 238f in embryonic migration, 242f networks, 240, 241f, 244, 244f tumor cells and, 256HP, 256HPf Base pairs of DNA, 394–396, 395f–396f complementarity, 396–397 numbers in genomes, 412, 412f in repair, 567, 567f of RNA complementarity, 429 nonstandard, 429, 429f Base substitutions, 413fn, 418–419, 462–463, 463f, 464f Basic helix-loop-helix (bHLH) motif, 521, 521f Bax, in apoptosis, 659–660, 659f BAX gene, in cancer development, 676, 679f B cells in autoimmune disease, 724HP, 726HP clonal selection theory applied to, 704–706, 704f, 705f, 706f vaccination, 706–707 DNA rearrangements producing genes encoding antigen receptors of, 713–716, 714f, 715f in immune response, 701f, 703, 703f memory, 705–706 selection of, 721fn TH cell activation of, 722f, 723 TH cell interaction with, 709–710, 709f, 709fn B-cell receptor (BCR), 716, 716f DNA rearrangements producing genes encoding, 713–716, 714f, 715f BCL-2 oncogene, 681, 682f Bcl-2 proteins in apoptosis, 659–660, 659f, 659fn in cancer development, 676 Beadle-Tatum experiment, 427–428, 427f Benign tumors, 670, 683, 684f Benlysta, 726HP BER. See Base excision repair β-adrenergic receptors, 120f, 617f–618f Beta (β) barrels, 135, 182, 182f, 316fn β clamp, 554–556, 555f, 556f Beta cells, 644 destruction of, 20HP, 725HP reprogramming of, 23HP Beta particles, 749

Beta (β)-pleated sheet AD and, 67HP–68HP depiction of, 57, 57f folding of, 64, 64f, 65f in myoglobin, 58, 58f of β-barrel, 182f of immunoglobulin domain, 252f, 711f of polypeptide chains, 56–57, 56f Betaseron, 726HP Bexxar, 688 Biased diffusion across membranes, 317 Bicarbonate ions, in blood buffer system, 40 in cystic fibrosis, 162HP in erythrocyte function, 145 Bicoid gene, 537–538, 538f Binding protein (BiP) in antibody heavy chain synthesis, 80EP–81EP as chaperone for misfolded proteins, 288, 289f in protein synthesis, 282f Biochemicals, 40 Biochemical reactions coupled, 92–93 equilibrium in, 90–91 free-energy changes in, 89–94 standard conditions, 91 Bioenergetics, 87–94 laws of thermodynamics, 87–89 Bioengineering of proteins, 73–76, 75f Biofilms, 13 Biological molecules carbohydrates, 43–47 linking sugars together, 44–45, 45f, 46f polysaccharides, 45–47, 46f, 47f stereoisomerism, 44, 44f, 45f sugar structures, 43–44, 43f classification by metabolic function, 41–42, 42f functional groups, 41, 41t lipids, 47–49 fats, 47–49, 48f phospholipids, 49, 49f, 125–127, 126f steroids, 49, 49f, see also Cholesterol nucleic acids, 77–79, 77f, 78f, 79f, 393–396 polar compared with nonpolar, 34 properties, 40–42 proteins, 50–65, 50f, 70–77 adaptation and evolution, 76–77, 76f, 77f building blocks of, 50–65, 50f, 70–77 engineering of, 73–76, 74f, 75f interactions of, 61–65, 61f, 62f, 63f proteomics, 70–73, 71f, 72f, 73f structure of, 54–61, 55f, 56f, 57f, 58f, 59f, 60f, 61f types of, 43–79, 43f Biomarkers, 72 for cancer, 693 Biosynthetic (secretory) pathway, 271, 272f, 274f, 296f discovery of, 273 study via mutants, 277–278, 277f 1,3-Bisphosphoglycerate (BPG), 111f, 112–113, 227f Bivalents (tetrads), 389, 389f, 391f, 606, 606f Bladder cancer, 665f oncogenes in, 697EP Blistering diseases, 250, 257, 356 Blood, buffer system of, 40 Blood-brain barrier, 262, 307 drugs that breach, 726 Blood clotting in atherosclerosis, 314 integrins and, 245, 246–247, 247f
preventive medications, 246–247

Blood glucose GPCRs in regulation of, 631–632, 631f cAMP in glucose mobilization, 632, 632f, 633f other aspects of cAMP signal transduction, 632–634, 633f, 633t, 634f, 635f insulin receptors in regulation of, 644 diabetes mellitus and, 646 glucose transport, 646, 646f human longevity and, 647HP–648HP, 647HPf insulin receptors as protein-tyrosine kinases, 644, 644f insulin receptor substrates 1 and 2 in, 644–646, 645f Blood groups, determinants of, 130, 130f, 145, 147 Blood vessels, in cancer, inhibition of formation of, 692–693, 693f Blotting, 763, 763f Bonds, 33–38 covalent, 33–34, 33f double, 34 cis and trans, 48 in fatty acids, 48 glycosidic, 44, 45f, 46f hydrogen, 36, 36f, 37f hydrolysis of, 42f hydrophobic interactions, 36–37, 37f ionic, 35–36, 36f noncovalent, 34–38, 36f in nucleic acids, 77f, 78 triple, 34 van der Waals forces, 37, 38f Bone marrow, in immune response, 700f, 703, 703f Bone marrow transplants, 20HP, 209HP, 307HP, 726HP BRAF oncogene, 680 cancer therapy targeting, 691, 691f Brain AD in, 66HP–70HP aggregates in, in Huntington’s disease, 404HP–405HP blood-brain barrier, 262 CJD in, 66HP gene expression, species differences, 71f, 414 human compared with chimpanzee, 414–415 spongiform encephalopathy in, 66HP stem cells of, 20HP Brain tumors genes associated with, 680–681 immunotherapy for, 689 incidence and mortality of, 665f mutations in, 684 Branch migration during recombination, 610, 610f BRCA1/BRCA2 genes in cancer development, 678, 679f, 682f cancer therapy based on, 692 Breast cancer, 665 abnormal chromosomes of, 667f angiogenesis inhibition in, 693 development of, 669 diet role in, 668, 668f drugs preventing, 669 genes associated with, 673, 673t, 675f, 678 genetics in prognosis of, 687, 687f immunotherapy for, 688 incidence and mortality of, 665f inhibition of cancer-promoting proteins in, 692 mutations in, 685 transcription profiling of, 517, 517f tyrosine phosphorylation in, 620f Bright-field microscope, 735 specimen preparation for, 735

Brownian ratchet, 317 Buffers, 39–40, 39t Bulk-phase endocytosis, 308 α-Bungarotoxin, 172EP Burkitt’s lymphoma, 668, 680 C C3 plants, 227 carbohydrate synthesis in, 226–231 C4 pathway, 232 C4 plants, 231f, 232 Ca2+-binding proteins, 651–652, 651t, 652f Cadherins, 253–254, 253f, 254f, 257, 259f in adherens junctions, 257, 258f in cancer, 256HP–257HP Caenorhabditis elegans, 18f, 412f, 657 Calcium-induced calcium release (CICR), 649–650, 650f Calcium ion channels, voltage-gated in cell signaling, 649 in synaptic transmission, 169, 169f Calcium (Ca2+) ions in cell-cell adhesion, 253, 253f in exocytosis, 303 as intracellular messengers, 648 Ca2+-binding proteins, 651–652, 651t, 652f IP3 and voltage-gated calcium ion channels, 649 plant regulation of Ca2+ concentration, 652–653, 652f visualizing cytoplasmic Ca2+ concentration in real time, 649–651, 649f, 650f, 651f in muscle contraction, 370–371, 371f in photosynthesis, 220f, 221–222 role of mitochondria in, 180, 648fn in smooth endoplasmic reticulum, 280 in synaptic transmission, 169, 169f transport across membranes, 159–160 Calcium pump (Ca2+-ATPase), 159–160 Calcium-sensitive fluorophores, 649, 737 Calmodulin, 72–73 Calorie, 33fn Calorie-restricted diets, 647HP–648HP, 647HPf Calvin cycle (Calvin-Benson cycle), 227f, 228–229, 228f, 229f cAMP. See Cyclic adenosine monophosphate CAM plants, 232 cAMP receptor protein (CRP), 487 cAMP response element (CRE), 632–633, 633f cAMP response element-binding protein (CREB), 632, 633f Cancer, 664–698.
See also Carcinogens; specific cancers causes of, 667–669, 668f cells of growth properties of, 665–667, 666f, 667f histology of, 607f, 670 mutation of, 669–681, 670f, 671f, 672f, 673t, 674f, 675f, 676f, 677f, 679f, 682f normal cells compared with, 664–667, 666f, 667f genetics of, 669–671, 670f cancer genome, 683–685, 683f, 684f, 685t gene-expression analysis, 685–687, 686f, 687f microRNAs, 681–683 mutator phenotype, 681 oncogenes, 671–672, 671f, 672f, 679–681, 679f, 682f tumor-suppressor genes, 671–679, 671f, 672f, 673t, 674f, 675f, 676f, 677f, 679f incidence and mortality of, 665, 665f, 668f inflammation role in, 668 inheritance of, 664, 669, 673, 675, 678

metastatic spread of, 256HP–257HP, 256HPf, 664, 665f mortality rates of, 665, 665f NK cells killing, 702, 702f oncogenes in, 671–672, 671f, 672f, 679, 679f, 682f cancer therapy targeting, 689–692, 690t, 691f cell signaling by, 640–643 discovery of, 694EP–697EP, 694EPf, 694EPt, 695EPf, 696EPf encoding cytoplasmic protein kinases, 680 encoding growth factors or growth factor receptors, 680 encoding metabolic enzymes, 681 encoding products affecting apoptosis, 681 encoding proteins affecting epigenetic state of chromatin, 680–681 encoding transcription factors, 680 proto-oncogene activation into, 671–672, 671f, 672f role of cell-adhesion molecules, 256HP–257HP of skin, DNA repair defects, 568HP–569HP stem cells, 692 therapy for, 687–688 chemotherapy, 665, 677, 687 early detection, 693 gene-expression analysis guiding, 687, 687f immunotherapy, 688–689 inhibition of angiogenesis, 692–693, 693f inhibition of cancer-promoting proteins, 689–692, 690t, 691f preventive measures, 668 radiation, 665, 677, 687 targeted, 687–688 tumor-suppressor genes in, 671–679, 671f, 672f, 673t, 674f APC gene role in FAP, 678 BRCA1/BRCA2 role in breast cancer, 678, 679f PTEN, 678 RB role in cell cycle regulation, 674, 675f TP53 role in cell arrest and apoptosis, 675–677, 675f, 676f, 677f, 679f TP53 role in senescence, 677–678, 679f tyrosine phosphorylation in, 620f Cancer cells, normalized by RNA interference therapy, 458HP Candidate-gene studies, 418HP Cannabinoid (CB1) receptors, 170 Capping, of pre-mRNA, 448, 449f, 450f, 453f enzymes for, 449–450, 449f Capping proteins of actin filaments, 373, 373f in cell locomotion, 378, 378f Capsid of virus, 23–24, 23f, 24f Capsule of bacteria, 8f–9f, 704f Captopril, 104 Carbohydrates, 43–47 energy capture and use, 110–115 functions of, 43 in glycoproteins, 51, 252f, 285–286, 287f, 288f, 292–293 membrane asymmetry and, 285f linking sugars together, 44–45, 45f, 46f in membranes, 124f, 129–130 metabolism, overview, 183f oxidation, 183–185
polysaccharides, 45–47 cellulose, chitin, and glycosaminoglycans, 46f, 47, 47f glycogen and starch, 45–47, 46f stereoisomerism, 44, 44f, 45f sugar structures, 43–44, 43f

synthesis, 212, 214–215, 215f, 227f, 228f, 285-288, 292-293 carbon dioxide fixation and, 226–232 energy needed, 225, 228 Carbon atoms asymmetric, 44, 44f chemical properties of, 40 molecular formations of, 40–41 oxidation states, 109–110, 110f reduction in photosynthesis, 223 Carbon bonds, protein engineering to break, 74 Carbon dioxide (CO2) conversion to carbohydrate, 212, 214–215, 215f, 227f, 228f energy needed, 225 exchange in erythrocyte, 145 fixation, 227, 227f, 228f, 229–231 alternate pathway, 231–232 in C4 plants, 231f, 232 in CAM plants, 232 carbohydrate synthesis and, 226–232 in photorespiration, 229–231, 229f, 230f in photosynthesis, 212, 214–215, 223 in respiration, 185–186 Carbonic acid, in blood buffer system, 40 Carbonyl group, of amino acids, 50–51, 51f in alpha helix, 56f in peptide bonds, 51, 51f Carboxyl group, 41t, 43 Carboxyl-terminal domain (CTD) of RNA polymerase II, 443, 443f, 453, 453f Carcinogens, 667–668 Cardiolipin, 127t, 182 Cargo, in transport, 272, 293, 294f, 296–297, 297f Cargo receptors in transport vesicles, 296, 297f dynamic studies, 320EP Carotenoids, 217, 217f Cartilage cells, 237f, 241f Caspases, 657–660, 659f Caspase activated DNase (CAD), 658 Catabolic pathways, 108, 109f, 186, 186f separation from anabolic pathways, 116–117 Catabolite repression, of lac operon, 486 Catalase, hydrogen peroxide and, 35HP Catalytic constant (kcat, turnover number), 95t, 103 Catenins, 253, 257, 258f Cations, 34 Cation exchangers, 753 CD4 molecules, 717, 718f, 721 CD8 molecules, 717, 718f, 721 CD19, cancer therapy targeting, 689 CD20, therapeutic targeting, 688, 726 Cdc2 kinase, 592, 593f, 614EP Cdh1, 592 Cdks. See Cyclin-dependent kinases cDNAs. See Complementary DNAs Cells. 
See also Cell division basic properties of, 3–7 cancerous growth properties of, 665–667, 666f, 667f histology of, 607f, 670 mutation of, 669–681, 670f, 671f, 672f, 673t, 674f, 675f, 676f, 677f, 679f, 682f normal cells compared with, 664–667, 666f, 667f chemical reactions of, 6 classes of, 7–19, 8f–9f death of, 3. See also Apoptosis unfolded protein response and, 288 discovery of, 2–3, 3f DNA transfer into, 775–778, 776f, 777f dynamic activities, 374

evolution of, 7 fractionation of, 752, 752f genetic program of, 5 of immune system, 699–700, 699f interactions, 259f with environment, 235–269, 236f, 260f with extracellular materials, 244–250 with other cells, 250–260 internal organization of, 3–5, 4f, 18f role of microtubules, 332 mechanical activities of, 6 multipotent, 20HP organization into tissues, 236f phagocytic, 304 pluripotent, 21HP induced, 22HP–23HP, 22HPf precancerous, 607f, 670 prokaryotic compared with eukaryotic, 8–13, 8f–9f, 10t sizes, 17, 19f structure of, 10–12, 11f, 18f reproduction by, 5, 5f responses to stimuli, 6 self-regulation of, 6–7, 6f, 7f signaling in survival of, 660 sizes of, 17, 19f specialization of, 15–17, 15f, 16f stimuli response of, 6 structure of, 8f–9f, 10–12, 11f, 18f role of lipid bilayer, 127–128 surface area/volume ratio of, 17 telomerase in, 508 Cell-adhesion molecules, 251–254, 259f. See also Adhesion, between cells in cancer, 256HP in inflammation, 255HP–257HP, 255HPf transmembrane signaling and, 259–260 Cell biology techniques, 732–783 AFM, 748, 748f antibodies used in, 780–783, 782f cell cultures, 749–751, 751f chemical synthesis of DNA and RNA, 764 differential centrifugation, 752, 752f DNA libraries, 773 cDNA libraries, 773–775, 775f genomic libraries, 773–774, 775f DNA sequencing, 771–773, 772f DNA transfer into eukaryotic cells and mammalian embryos, 775–778, 776f, 777f genes elimination/silencing knockout mice, 778–780, 779f, 779fn RNA interference, 780, 781f in vitro mutagenesis, 778 isolation, purification, and fractionation of proteins, 752 liquid column chromatography, 753–756, 753f, 754f, 755f PAGE, 756–757, 756f, 757f protein measurement and analysis techniques, 757–758, 758f, 758fn selective precipitation, 752–753 light microscopy, 733, 733f bright-field microscopes, 735 fluorescence microscopy, 736–740, 737f, 738f, 739f, 740f laser scanning confocal microscopy, 739–740, 739f phase-contrast microscopes, 735–736, 736f resolution of, 733–734, 734f 
super-resolution fluorescence microscopy, 740, 740f

video microscopy and image processing, 738–739 visibility with, 734–735, 735f nucleic acid fractionation, 760 by gel electrophoresis, 760, 761f by ultracentrifugation, 760–762, 762f nucleic acid hybridization, 762–764, 763f PCR, 769–771, 770f radioisotopes, 748–749, 749t, 750f recombinant DNA technology, 764 DNA cloning in, 766–769, 767f, 768f, 769f recombinant DNA formation in, 766, 766f restriction endonucleases in, 764–766, 765f SEM, 740–741, 746–748, 747f specimen preparation for, 747 structure determination of proteins and multisubunit complexes, 758–760, 758f, 759f, 760f TEM, 740–742, 741f, 742f specimen preparation for, 742–746, 743f, 744f, 745f, 746f Cell-cell adhesion. See Adhesion, between cells Cell-cell recognition, 250–251, 251f CellCept, 725HP Cell coat. See Glycocalyx Cell cultures, 749–751, 751f first human culture, 3, 3f Cell cycle, 573–581, 573f, 574f control of, 574–580, 575f checkpoints, Cdk inhibitors, and cellular responses, 579–581, 580f, 581f protein kinases in, 575–579, 575f, 576f, 577f, 579f DNA replication in, 560–561, 560f microtubule dynamics, 341–342, 341f p53 role in arresting, 676, 676f, 677f pRB role in regulating, 674, 675f regulation, role of maturation-promoting factor, 611EP–614EP in vivo, 574 Cell division, 581–609 asymmetric, 574 cytoskeleton in, 325f, 326 eukaryotic compared with prokaryotic, 12 as property of cell, 5 Cell-free systems, 276, 549, 559, 572f, 583, 671, 752 Cell fusion, 141, 141f, 574–575, 575f mobility of membrane proteins and, 141 Cell lines, 508f, 751 Cell locomotion, 1f, 326, 326f, 374–379, 376f, 378f Cell-mediated immunity, 703–704, 707–726 Cell migration in embryo, 242f, 243, 243f, 254, 254f in inflammation, 255HP, 255HPf, 377f Cell plate of plants, 268, 601, 601f Cell replacement therapy, 20HP–23HP adult stem cells for, 20HP direct cell reprogramming, 23HP embryonic stem cells for, 21HP–22HP, 21HPf induced pluripotent stem cells, 22HP–23HP, 22HPf Cell
signaling, 617–618, 617f–618f in apoptosis, 656–658, 657f extrinsic pathway, 658–659, 658f intrinsic pathway, 659–660, 659f, 659fn, 660f basic elements of, 618–621, 618f, 619f, 620f Ca2+ as intracellular messenger, 648 Ca2+-binding proteins, 651–652, 651t, 652f IP3 and voltage-gated calcium ion channels, 649 plant regulation of Ca2+ concentration, 652–653, 652f

visualizing cytoplasmic Ca2+ concentration in real time, 649–651, 649f, 650f, 651f in cell survival, 660 convergence, divergence, and cross-talk in, 653–655, 653f, 654f extracellular messengers in, 618–621, 619f survey of, 621 GPCRs in, 617f–618f blood glucose regulation, 631–634, 631f, 632f, 633f, 633t, 634f, 635f disorders associated with, 625HP–626HP, 625HPf, 626HPt second messengers of, 621–622, 622f, 622t, 627–630, 627f, 628f, 629f, 630f, 631t in sensory perception, 634–636 signal transduction by, 622–627, 623f, 625f specificity of responses of, 630–631 human longevity and, 647HP–648HP, 647HPf lipid rafts and, 139–140, 140f across membranes, 122, 122f NO as intercellular messenger, 655–656, 655f protein-tyrosine phosphorylation in, 620–621, 620f, 636 downstream signaling processes activated by, 638–640, 639f end of, 640 in insulin receptor signaling, 644–646, 644f, 645f, 646f, 647HP–648HP, 647HPf phosphotyrosine-dependent protein–protein interactions in, 638, 638f in plants, 648 protein kinase activation in, 638 in Ras-MAP kinase pathway, 640–644, 641f, 642f, 643f receptor dimerization in, 636–638, 637f Cell surface receptors, 236, 236f, 239f, 617–660 coated pits and, 308–309, 310f in endocytic pathway, 312, 313f for extracellular ligand uptake, 308, 312 “housekeeping” compared with “signaling” receptors, 312, 313f Cell-surface signals, lymphocyte activation by, 722–723, 722f Cell theory, 2–3 Cellular reproduction, 572–614 cell cycle of, 573–581, 573f, 574f control of, 574–580, 575f in vivo, 574 meiosis, 602–611, 602f, 603f anaphase, 607, 607f genetic recombination during, 610–611, 610f metaphase, 607, 607f prophase, 603–604, 604f, 605f stages, 602–608, 604f telophase, 607f, 608 mitosis, 581–602, 582f anaphase, 592–597, 593f, 594f, 595f, 596f cytokinesis with, 597–602, 598f, 599f, 600f metaphase, 590–592, 591f prometaphase, 588–590, 589f, 590f, 591f prophase, 582f, 583–588, 583f, 584f stages of, 582f telophase, 597, 597f Cellular reprogramming, 23HP, 512, 519,
519f Cellulose, 46f, 47, 47f, 266–268, 266f microfibril synthesis, 267f Cell walls, 266–268, 266f in bacteria, 8f–9f enzymes and, 106HP of eukaryotes, 9

in plants, 8f–9f, 266–268 in cytokinesis, 601–602, 601f of prokaryotes, 9 CENP-A, 509, 585 Centrifugation differential, 275, 276f, 752, 752f of nucleic acids, 760–762, 762f equilibrium centrifugation, 762, 762f velocity sedimentation, 761–762, 762f Centrioles, 8f–9f, 338f, 339, 586–587, 587f Centromeres, 509, 509f, 585, 586f Centrosomes, 338f, 339–340, 339f, 586 duplication of, 586, 587f γ-tubulin and, 340f, 341 spindle pole formation without, 588, 588f Centrosome cycle, 586, 587f Ceramides, 126, 126f Cerebrosides, 126, 126f Cervical cancer detection of, 670, 670f HPV in, 668 cGMP. See Cyclic guanosine monophosphate Chaperones. See also Molecular chaperones in endoplasmic reticulum, 282f, 283 in mitochondria, 81EP in protein folding, 65, 65f, 80EP–83EP, 287–288, 288f for misfolded proteins, 288, 289f in protein transport, 316–317, 317f, 318f Chaperonins, 65, 81EP–83EP Chargaff’s rules of DNA base composition, 394 Charge (electric) chemical bonds and, 36f ionization and, 34 voltage and potential compared with, 165, 165f Charge density, protein separation based on, 756 Chase, in pulse-chase experiment, 273 Checkpoints, 579 in cell cycle control, 579–581, 580f, 581f activation of, 579–580 for mitotic spindle assembly, 596–597, 596f Chemical basis of life, 32–85 Chemical energy in carbohydrates, 49 in fats, 49 Chemical “libraries,” in drug design, 75, 75f, 690 Chemical reactions.
See also Biochemical reactions of cell, 6 exergonic compared with endergonic, 90 exothermic compared with endothermic, 88 free-energy changes in, 90–92 predicting direction, 90 Chemical synthesis, of DNA and RNA, 764 Chemiosmotic mechanism, 187, 187f Chemoautotrophs, 212 Chemokines, in immune response, 708–709 Chemotherapy, 665, 677, 687 Chiasmata, 606–607, 606f chromosomal, 391f, 609HP Chimeric DNA, 273–274 Chimpanzee’s genome, human’s compared with, 414–415 Chitin, 46f, 47, 47f Chloride ion channels, 162HP, 169, 169f, 652f Chlorophylls, 216–217, 217f, 223, 223f effects of light absorption, 216 of reaction centers, 218–221, 218f structure of, 216f Chloroplasts, 8f–9f, 12, 211f, 212, 213f, 307f cellular roles, 5f

cyanobacteria and, 14, 14f endosymbiont origins of, 26EP–27EP, 27EPf, 28EPf energy absorption, 216 enzymes, 228–229, 229f genome of, 212, 214 membranes, 213–214, 213f mitochondria and, 213–215 photosynthesis and, 211–234 presumed origins, 29EP protein uptake, 318, 318f size of, 19f structure and function, 213–214 Cholera toxin, 627 Cholesterol, 40, 40f, 49, 127t familial hypercholesterolemia and, 319EP–320EP “good” form, 314 HMG CoA reductase and, 319EP in lipid bilayer, 127f lipid rafts and, 139, 140f in lipoprotein complexes, 313, 314f longevity and, 314 in membranes, 124f, 127, 129f metabolism of, 313–315 Cholesteryl ester transfer protein (CETP), 314–315 Choline, phosphoglyceride linked to, 126 Chondrocytes, 237f Chromatids, 584, 607, 607f failure to separate, 608HP Chromatin, 11, 11f, 488, 494 DNA replication and, 562–564, 563f folding of, 496 higher structures of, 496–498, 496f, 497f histones and, 527, 527f, 530 loops of, 497, 497f in meiosis, 605, 606f ncRNA and, 461 oncogenes encoding proteins affecting epigenetic state of, 680–681 remodeling of, 528, 529f structure of, coactivators and, 526–530, 527f, 528f, 529f transcription and, 513 Chromatin modification complexes, 530 Chromatin remodeling complexes, 527–528, 528f Chromatography, liquid column, 753 affinity chromatography, 754, 754f gel filtration chromatography, 754, 754f ion-exchange chromatography, 753, 753f techniques for determining protein–protein interactions, 754–756, 754f, 755f Chromodomain, 501 Chromosomal microtubules, 590, 591f, 595f Chromosomes, 388–393.
See also Bivalents; Chiasmata accessory, 389 banding patterns, 392, 392f as carriers of genetic information, 389 centromeres, 509, 509f compaction of, 582f, 583, 583f congression of, 589, 589f crossing over, 390–391, 391f “hot spots,” 419HP, 611 in meiosis, 606, 606f unequal, 407, 407f discovery, 388–389 gene locus, 391 genome packaging, 494–498 giant, 392, 392f heterochromatin and euchromatin, 498–499, 498f histone code and heterochromatin formation, 499–501, 500f, 501f, 502f

linkage, 389 marker, 509 in meiosis, 602 in metaphase, 607, 607f in prophase, 603–604, 604f, 605f recombination of, 610–611, 610f mitotic, 497, 497f in anaphase, 592–594, 594f formation of, 583–586 in metaphase, 590–592, 591f in prometaphase, 588–590, 589f, 590f, 591f in prophase, 582f, 583–588, 583f, 584f structure of, 501–502, 503f motor proteins on, 597, 597f movement during anaphase, 594–596, 595f of normal compared with cancer cells, 666–667, 667f nucleosomes, 494–496, 494f, 494t, 495f polytene, 392, 392f prions in, 66HP prokaryotic compared with eukaryotic, 12 reduction division, 389 sex chromosomes, 390f structural variants, 407, 407fn, 416f, 417 synapsis of, 604–605, 605f telomeres, 505–509, 506f, 507f, 508f territories of, 510f Chromosome aberrations deletions, 407, 416f, 417, 505HP duplications of segments, 407, 407f, 407fn, 417, 505HP human disorders and, 504HP–505HP insertions, 416f, 417 inversions, 416f, 417, 504HP nondisjunction, 608HP–609HP, 608HPf Chromosome compaction, 582f, 583, 583f Chromosome condensation. See Chromosome compaction Chromosome number, 388–389, 505HP abnormalities, 608HP–609HP, 608HPf diploid, 406 in gametes compared with zygotes, 388f haploid, 399 normal human complement, 608HP polyploidization, 406–407 Chromosome puffs, 392, 392f Chromosome walking, 774 Chronic lymphocytic leukemia (CLL), immunotherapy for, 688–689 Chronic myelogenous leukemia (CML) developing medications for, 75, 75f inhibition of cancer-promoting proteins in, 690 Cilia, 345–353. 
See also Axonemes basal bodies and, 340, 346–347, 348f beating pattern, 345, 346f in development and disease, 349HP–350HP of embryonic node, 349HPfn flagella compared with, 345 locomotion mechanism, 351–352, 352f role of dynein, 350–351, 351f sliding-microtubule theory, 353, 353f nonmotile, 345 size of, 19f structure, 346, 347f Ciliary (axonemal, flagellar) dynein, 350–351, 351f Ciliopathies, 350HP Cimzia, 726HP Ciprofloxacin, 106HP Circulating tumor cells (CTCs), 256HP, 256HPf Cis double bonds, 48 Cis-Golgi network (CGN), 290, 291f, 296f

Cisternae, 270 of Golgi complex, 272f, 290, 291f, 296f of rough endoplasmic reticulum, 279, 279f Cisternal maturation model, 293–294, 294f Clamp loader during DNA synthesis, 556, 556f, 562 Class switching, Ig heavy chains, 715–716, 715f Clathrin, 308, 309f, 319EP–321EP, 321EPf Clathrin-coated vesicles, 295, 296f, 300f, 308-311 accessory proteins, 311 dynamic studies, 320EP–321EP, 321EPf early studies, 319EP–320EP in endocytosis, 308–311, 309f, 310f, 311f formation, 300, 300f, 309f myosins, 364 structure, 299–300, 308–311, 310f transport of lysosomal enzymes, 299–300, 299f Clathrin-mediated endocytosis, 308-311, 320EP–321EP, 321EPf Claudins, 262, 262f Clinical trials, 75f Clonal selection theory, applied to B cells, 704–706, 704f, 705f, 706f vaccination, 706–707 Cloning of animals, 513, 513f DNA, 766–769, 767f, 768f, 769f eukaryotic DNAs in bacterial plasmids, 767–768, 767f, 768f eukaryotic DNAs in phage genomes, 768–769, 769f PCR in, 770–771 specialized vectors for larger fragments, 774 of plants, 512–513 Clots. 
See Blood clotting Coactivators, 526 basal transcription machinery and, 526 chromatin structure and, 526–530, 527f, 528f, 529f Coated pits, 308–311, 309f studies of, 319EP–321EP, 320EPf, 321EPf Coated vesicles, 295–298, 295f, 296f, 310f formation of, 296f, 308, 309f, 311f phosphoinositides in, 311–312, 312f loss of coat, 321EP, 321EPf in receptor-mediated endocytosis, 308–311 studies of, 319EP–321EP, 319EPf, 321EPf Coat proteins in vesicle budding, 276, 277f of vesicles, 296–297, 296f, 300 Cocaine, 170 Cockayne syndrome (CS), 568HP Codons, 462 amino acids specified, 463–464, 464f in nuclear compared with mitochondrial mRNA, 464 decoding by tRNAs, 464–468 interactions with anticodons, 466, 467f, 471f, 472f mutations in, 464f in polypeptide synthesis, 471–474, 472f termination signal, 463 third-nucleotide variability, 464, 466 wobble hypothesis, 466, 467f Coenzymes, 95 in fatty acid cycle, 186f reduced ATP formation and, 186–187 in electron-transport chain, 187 energy from, 187 in fatty acid cycle, 186f in oxidative phosphorylation, 186, 187f in TCA cycle, 185f, 186 in TCA cycle, 185f, 186

Coenzyme A, 183, 185, 186f Coenzyme Q. See Ubiquinone Cofactors for enzymes, 95 in photosystem I, 223 Cofilin, 373–374, 378, 378f Cohesin, 497, 497f, 584–585, 584f, 585f in meiosis, 605, 606f Coley’s toxin, 688 Colicins, 478EP–479EP Colitis, 459HP, 725 Collagens, 238–240, 239f abnormalities, 240 of basement membrane, 241f, 244, 244f size of, 17 Collagen genes, mutations, 240 Colon cancer, 665 angiogenesis inhibition in, 693 development of, 669 diet role in, 668, 668f DNA repair defects and, 569HP drugs preventing, 669 FAP, 673t, 678 genes associated with, 673t, 675f, 678 immunotherapy for, 688 incidence and mortality of, 665f inflammation role in, 668 mutations in, 683–684, 683f, 684f Communication between cells, 236 role of plasma membrane, 122, 122f and their environment, 236 via desmotubules, 265 via gap junctions, 262–264, 263f via tunneling nanotubes, 264, 264f Comparative genomics, 413–414 Compartmentalization by membranes, 121, 122f Competitive enzyme inhibitors, 104, 105f Complement, in immune response, 701f, 702 Complementarity of DNA, 396–397 of mRNA, 428 of nucleic acids, 400 of RNAs, 429f of tRNAs, 465–466, 465f Complementarity-determining regions (CDRs), 730EP, 730EPf Complementary base-pairing. See Base-pairing Complementary DNAs (cDNAs), 446, 446f libraries, 773–775, 775f Complexity of cell, 3–5, 4f Concentration gradients, 147–148, 148f, 161 separation of charge compared with, 198 role in ATP synthesis, 187, 198-206 Condenser lens, 733, 733f of TEMs, 741, 742f Condensin, 583, 584f, 585f Conditional knockouts, 780 Conductance, of ions, 151, 152f Confocal microscopes, 739–740, 739f Conformation of molecule, 55 native, of polypeptides, 428f of polypeptide chains, 55–57, 56f, 57f Conformational changes, 60, 115 allosteric modulation, 115–116, 116f covalent modification, 115 in enzymes, 100–102, 101f, 115 within integral membrane proteins, 135–137, 136f, 137f rotational catalysis, 202–204, 203f

Congenital agammaglobulinemia, 704 Congenital diseases of glycosylation (CDG), 286–287 Congression, of chromosomes, 589, 589f Conjugate acid, 39 Conjugate base, 39 Conjugation in bacteria, 12, 13f Connective tissue, 236 Connexins, 262–263, 263f Connexons, 262–263, 263f Consensus sequences, 433, 433f, 442, 450f Conservative replication, 546–547, 546f Conserved sequences, 413–414, 413f in actin, 358 of amino acids in proteins, 76 in ATP synthase, 200–201 consensus sequences compared with, 433 microRNAs, 459 natural selection and, 413fn of nucleotides, in pre-mRNA splice sites, 450 in photosynthetic reaction centers, 219 in potassium ion channels, 153 in rRNAs, 479EP Constitutive heterochromatin, 498 Contractile ring theory, 599, 599f Contractility of muscle, 364–371 nonmuscular, 374–381 Contrast, in microscopy, 735 Convergence, among signaling pathways, 653–655, 653f, 654f Copaxone, 726HP COPI-coated vesicles, 295, 295f, 296f, 298 retrieval of proteins, 298f COPII-coated vesicles, 295–298, 295f, 296f Coprecipitation, of proteins, 755 “Copy-and-paste” genetic transposition, 410f Copy number polymorphisms, 415f, 417 Corepressor, 485, 530 Core promoter, 522 Core promoter element, 442f Corneal stroma, 240, 240f Cortex (cortical zone) of cell, 372 in cytokinesis, 599, 599f, 600f Corticosteroids, 725HP Costimulatory signal, in lymphocyte activation, 722 Cotranslational protein transport, 281, 281fn, 282f Cotransport, 160–161, 161f Cotton, 47 Coupled reactions, 92–93 Couples, acid-base pairs, 39 Coupling factor 1 (F1) of ATP synthase, 199, 200f Covalent bonds, 33–34, 33f CREB. 
See cAMP response element-binding protein Creutzfeldt-Jakob disease (CJD), 66HP Criminal investigations, use of DNA, 402, 402f c ring of ATP synthase, 201, 201f, 204–205, 204f Cristae, 179f, 180, 181f Cristal membrane, 180, 181f Critical-point drying, for SEM, 747 Crohn’s disease, 459HP, 725HP Cross-bridges between cytoskeletal elements, 333f, 354, 354f of sarcomere, 366, 367f, 368f, 369 energetics, 369–370 rigor mortis and, 370 Cross-linking proteins, 333f, 354, 354f, 373, 373f Cross-talk, among signaling pathways, 653–655, 653f, 654f CRP. See cAMP receptor protein Cryoelectron tomography (Cryo-ET), 12f, 744 Cryofixation, for TEM, 743–746

Crystallography, X-ray, 57-58, 100-101, 133-134, 623, 758–760, 758f, 759f, 760f C-terminal domain. See Carboxyl-terminal domain C-terminus, of polypeptide chain, 51 CTLA-4 therapeutic targeting, 688, 726 in immune response, 723 CTLs. See Cytotoxic T lymphocytes Cultures. See Cell cultures “Cut-and-paste” genetic transposition, 409–410, 409f, 410f Cuvette, 757 Cyanobacteria (blue-green algae), 8, 14, 14f, 212 as chloroplast ancestors, 212 chloroplasts and, 14, 14f, 27EP, 27EPf, 28EPf Cycle sequencing, 771–772, 772f Cyclic adenosine monophosphate (cAMP), 627, 627f in glucose mobilization, 632, 632f, 633f in lac operon, 487 other aspects of, 632–634, 633f, 633t, 634f, 635f Cyclic carbon backbones, 40 of sugars, 43–44, 43f Cyclic guanosine monophosphate (cGMP), 633, 634, 655f, 656 Cyclic nucleotides. See Cyclic adenosine monophosphate; Cyclic guanosine monophosphate Cyclic photophosphorylation, 226, 226f Cyclin-dependent kinases (Cdks), 575–576 inhibitors of, 577 in cell cycle control, 579–581, 580f, 581f in cell differentiation, 581, 581f oncogenes encoding, 680, 682f phosphorylation/dephosphorylation of, 576f, 577, 577f Cyclins, 576-579 fertilization and, 613EP, 613EPf relation to cdc2 protein kinase, 614EP relation to maturation-promoting factor, 613EP, 614EP Cyclooxygenase-2, in cancer, 669 Cyclosporin A, 716, 725HP Cys. 
See Cysteine Cysteine (Cys, C), 52f, 53–54, 135 Cystic fibrosis (CF), 162HP, 288, 475 Cystic fibrosis transmembrane conductance regulator (CFTR), 162HP–163HP Cytochromes, 191, 193f Cytochrome b, 222, 223f, 224f Cytochrome bc1, 194f, 196, 222 Cytochrome c, 54, 54f, 193, 193f, 659f, 660 Cytochrome oxidase (cytochrome c oxidase), 195f, 196–197, 197f Cytochrome P450 enzymes, 280, 419HP Cytokines in autoimmunity, 726HP in immune response, 377f, 708–709, 709t, 723–724 Cytokinesis, 573, 583, 597–602, 599f, 600f abscission, 598–599, 598f midbody during, 598, 598f myosin in, 599–600, 600f in plant cells, 601–602, 601f Cytoplasm, 12, 12f, 271f Cytoplasmic dyneins, 337–339, 337f direction of movement, 337f, 338 Golgi complex and, 337f in intraflagellar transport, 350f structure, 337–338 Cytoplasmic membrane systems, 122f, 270–323 Cytosine (C), 78, 78f, 393f, 394

base pairing, 395f–396f DNA repair and, 567 Cytoskeleton, 11f, 12f, 179f, 324–383 adherens junctions and, 257, 258f in AFM assay, 329 approaches to study, 326–330 caspases targeting, 658 components, 324, 325f dynamic nature, 326f, 345 FRAP study of, 329–330, 330f functions, 325–326, 325f motor proteins and, 334–339 origin of, 27EP in prokaryotes, 324–325 prokaryotic compared with eukaryotic, 10–12 signal transmission and, 257 Cytosol of animal cell, 8f–9f of eukaryotic cell, 11f, 12 molecular chaperones in, 65 of plant cell, 8f–9f Cytosolic space, of endoplasmic reticulum, 279, 279f Cytotoxic T lymphocytes (CTLs), 709 APC interactions with, 718f, 720 development of, 721 D Dalton, 24fn Daptomycin, 107HP Dark (light-independent) photosynthetic reactions, 216, 227f, 228f, 229 Darwinian selection, 413fn Davson-Danielli model, 124, 124f DCVax, 689 Deacylated tRNA, release of, 472f, 473–474 Death of cells, 3, 288. See also Apoptosis rigor mortis, role of sarcomere cross-bridges, 370 Deep-etching techniques, for TEM, 746, 746f Defensins, 702 Degeneracy of genetic code, 463, 464f Degron, 542 Dehydrogenases, 112, 190–191 Deletion mapping, 523 Dementia. See also Alzheimer’s disease FTDP-17 inherited, and tau microtubule-associated protein, 331–332 with NFTs, 70HP Denaturation of DNA, 399–400, 400f of proteins, 63 Dendrites, 164, 164f, 649f Dendritic cells (DCs), 700, 701f, 707–708, 707f, 708f T cell activation by, 722f in cancer vaccines, 689 Deoxyribonucleic acid. 
See DNA Deoxyribonucleoside terminology, 394fn Deoxyribose, 77f, 393–394, 394fn, 420EP Depactin, 373 Depolarization, 166, 166f in nerve impulse, 167, 167f, 169, 169f voltage-gated ion channels and, 155 Depolymerizing kinesins (depolymerases, kinesin-13), 336, 595–596, 595f Dermis, 236, 236f Desensitization, of receptor signaling, 624–625, 625f Desmin-related myopathy, 356 Desmosomes (maculae adherens), 257, 258f, 259f Detection, of cancer, 693 Detergents, solubilization of membrane proteins with, 132–133, 133f

Detoxifying enzymes, 280 D genes, 715–716 DHA. See Docosahexaenoic acid Diabetes. See also Type 1 diabetes Diabetes mellitus basement membrane of kidney and, 237–238 insulin receptor signaling and, 646 Diacylglycerol (DAG), 49, 49f, 629f, 630 Diakinesis, 607 Dicer ribonuclease (Dicer endonuclease), 456f, 457, 460 Dictyosome, 290fn Diet calorie-restricted, 647HP–648HP, 647HPf cancer and, 668, 668f Differential centrifugation, 276f, 752, 752f Differential interference contrast (DIC), 736, 736f Differentiated cells, 16 for cell replacement therapy, 21HP, 21HPf reprogramming of, 23HP, 512-513, 519 in tumor development, 669, 670f Diffusion, 147–148 in cells, 17 across membranes, 148–156, 317 energetics, 148 facilitated, 156–157, 156f, 157f of ions, 151–156 of nonelectrolytes, 149 of solutes, 147–149, 148f, 151–157 tight junctions and, 260–261 of water, 149–151, 150f Diffusion coefficient, 142, 142f Diglycerides, 125 Dihydroxyacetone phosphate, 111f 2,4-Dinitrophenol (DNP), 198–199 Diphosphatidylglycerol, 182 Diploid state, 406 Diplotene, 606–607, 606f Dipoles, 34, 37 Direct cell reprogramming, 23HP, 519 Direct immunofluorescence, 783 Disaccharides, 45, 45f Dispersive replication, 546f, 547 Dissociation, 39 of water, 40 Distal promoter elements, 525 Disulfide bridge with cysteine, 53 mercaptoethanol and, 63 Divergence, among signaling pathways, 653–655, 653f, 654f DNA, 77 in bacterial conjugation, 12, 13f base composition, 393–394. See also Base pairs Chargaff’s rules, 394 structural restrictions, 396 tetranucleotide theory, 394 catenated, 398, 399f chemical synthesis of, 764 chimeric, for fluorescent labeling studies, 273–274 of chloroplasts, 214 circular, 397f, 398 cloning of, 766–769, 767f, 768f, 769f eukaryotic DNAs in bacterial plasmids, 767–768, 767f, 768f eukaryotic DNAs in phage genomes, 768–769, 769f PCR in, 770–771 specialized vectors for larger fragments, 774 comparing molecules of, 771 complementarity, 396

conformational changes, during transcription, 442, 443f damage to, 679f cancer, 569HP from free radicals, 35HP p53 role in, 675–677, 675f, 676f, 677f, 679f denaturation, 399–400, 400f double helix, 395f–396f Watson-Crick model, 394–396 early studies, 420EP–423EP effects of topoisomerases, 398, 399f eukaryotic compared with prokaryotic, 10 evolution of, 455 Feulgen staining of, 735, 735f as genetic material discovery of, 420EP–423EP functions, 396–397, 397f hydrogen bonds in, 36, 36f as information storehouse, 396, 397f, 398–399, 428, 428f intergenic, 413 ionic bonds in, 35–36 libraries, 773 cDNA libraries, 773–775, 775f genomic libraries, 773–774, 775f linker, 495 melting, 400, 400f mitochondrial, 181f, 182 mutations in, 207HP–208HP, 208HPf nuclear compared with, 208HP naked, 494 nucleosomes in, 529–530, 529f overwound (positively supercoiled), 398, 430f PCR for amplification of, 769–771, 770f polarity, 77f protein engineering with, 73 quantification of, 771 from radiation, 568HP–569HP recombinant, 764 DNA cloning and, 766–769, 767f, 768f, 769f formation of, 766, 766f restriction endonucleases and, 764–766, 765f regulatory regions, and transcription factors, 444 relaxed, 397f, 398 renaturation (reannealing), 400, 401f in eukaryotes, 401f repair of, 545, 564–568, 565f base excision repair, 566–567, 566f, 567f deficiencies in, 568HP–569HP double-strand breakage repair, 567–568, 568f mismatch repair, 567 mismatch repair defects, 568HP–569HP mutant genes involved in, 681 nucleotide excision repair, 565, 566f p53 role in, 675–677, 675f, 676f, 677f XP and, 568–569 replication of, 396, 545–546, 545f in bacterial cells, 549–554, 549f, 550f, 551f, 552f, 553f, 554f, 554fn, 555f DNA polymerase structure and function, 554–558, 555f, 556f, 557f, 558f in eukaryotic cells, 558–564, 559f, 560f, 561t, 562f, 563f histones and, 509–510 semiconservative nature of, 546–549, 546f, 547f, 548f Watson-Crick model of, 396–397, 546, 546f α-satellite, 509 separation of, by gel 
electrophoresis, 760, 761f sequencing of, 771–773, 772f single-stranded

man-made, 403, 403f in transcription, 430f strands, 394 antiparallel nature, 396 complementarity, 396–397 polarity, 394 structure, 393, 393f, 395f–396f 5’ and 3’ ends, 393f, 394 backbone, 393f, 394, 395f–396f, 396 base stacking, 395f–396f, 396 major and minor grooves, 395f–396f, 396 terminology, 394fn Watson-Crick model, 386f, 394–396, 395f–396f X-ray diffraction pattern, 386f supercoiled, 397–398, 397f, 430f testing for specific sequences of, 771 tetranucleotide theory, 394, 420EP total amount, gene number compared with, 412f, 412fn in transcription, 429–431, 430f, 476f conformational changes, 442, 443f unwinding (strand separation), 430f, 431–432, 432f, 442–443 transcriptional control and, 522–525, 522f, 523f transcription unit, 434 transfer into eukaryotic cells and mammalian embryos, 775–778, 776f, 777f as transforming principle, 422EP underwound (negatively supercoiled), 398, 398f, 430f upstream and downstream strands, 430f viruses in study of, 26 DNA-binding proteins, grooves in DNA and, 396 DNA delivery systems, for gene therapy, 163HP DNA fingerprinting, 402, 402f DNA footprinting, 523 DNA glycosylase, 566–567, 566f DNA gyrase, 106HP, 550 DNA helicases, 442–443, 545f, 553–554, 554f DNA ligase, 552–553 DNA methylation, 530f, 531–532, 531f DNA methyltransferase, 531 DNA microarrays (“DNA chips”), 514–517, 515f, 516f cancer gene expression profiling with, 620f, 685–687, 686f, 687f DNA polymerases, 550–552, 551f, 554, 554fn, 555f in BER, 567 in eukaryotes, 561–562, 562f structure and function of, 554–556, 555f ensuring fidelity, 557–558, 557f, 558f exonuclease activities, 556–558, 557f, 558f DNA polymerase III holoenzyme, 554, 555f DNA rearrangements, of genes encoding B- and T-cell antigen receptors, 713–716, 714f, 715f DNA-RNA hybrids introns and, 446–448, 447f, 448f during nucleic acid hybridization, 762-764 during transcription, 430f, 431 “DNA-RNA-protein world,” 454 DNA sequences changes in, synonymous, 464, 464f chromosomal 
compared with genetic functions, 413–414 conserved compared with not conserved, 413–414, 413fn copy number polymorphisms, 417 in criminal investigations, 402, 402f duplications and modifications, 407–408, 407f evolutionary changes, 407–411, 413–414 “fingerprinting” with, 402f

human compared with chimpanzee, 414–415 “meaningful” compared with “junk,” 461 noncoding, 403, 408, 413 genome size and, 412fn nonrepeated (single-copy), 401, 401f, 405–406, 405f polymorphic, 402 repeated, 401–403, 401f coding compared with noncoding, 403 dispersion of, 408–411 genome size and, 412fn highly repeated, 401–403, 401f, 412 inverted repeats, 409, 410f moderately repeated, 401, 401f, 403, 410 operation of, 407, 407f, 410–411 terminal repeats, 409, 410f trinucleotide repeats, 404HP–405HP use in identifying persons, 402f satellites, 401–402 locating, 402–403, 403f “useless,” 414 variations in humans, 416 DNA topoisomerases, 398, 399f DNA transposons, 410, 410f DNA tumor viruses, 668, 674 DNMT3A oncogene, 680–681 Docking proteins, 302 RTK interaction with, 639f, 640 Docosahexaenoic acid (DHA), 126 Dolichol phosphate, 286, 287f Domains of prokaryotes, 14 of proteins, 59, 59f evolutionary shuffling, 411 taxonomic, 29EP of transmembrane proteins, 135f Dominant alleles, 387–388 Dopamine-producing neurons, 20HP Dopamine reuptake, 170 Dopamine transporter (DAT), 170 Double-blind trials, 69HP Double bonds, 34 cis and trans, 48 in fatty acids, 48 and saturation state, 126 Double-strand breaks (DSBs), 567–568, 568f, 580, 580f, 604, 610 Double-stranded RNA (dsRNA), 455–457, 470 Doublets, of axoneme, 346, 347f in microtubule sliding, 351–352, 351f, 352f, 353f Down syndrome, 609HP karyotype, 609HPf Drosha endonuclease, 456f Drosophila melanogaster, 18f, 390f chromosomes, 390f genome size of, 412 in genetic research, 390 polytene chromosomes of, 392 protein–protein interactions in, 62 DSBs. See Double-strand breaks DsRed, 737 Dual-label fluorescence, 321f, 732f, 737–738 Dwarfism, and DNA repair defects, 568HP Dyes with light microscopes, 735, 735f tracking, 756 Dynactin, 338 Dynamic instability of microtubules, 344–345, 344f Dynamin, 311, 311f Dyneins ciliary (axonemal, flagellar), 350–351 conformational changes, 351, 351f

cytoplasmic, 337–339, 337f direction of movement, 337f, 338 Golgi complex and, 337f in intraflagellar transport, 350f structure, 337–338, 337f Dynein arms of axoneme, 347f, 350–351, 351f conformational changes, 351f in microtubule sliding, 351–352, 351f, 352f E E2F proteins, RB gene interaction with, 674, 675f, 682f Early endosomes, 312, 313f Ectodermal cell, 237f Effectors, in cell signaling, 619 EGFR, 636 cancer therapy targeting, 688, 691 oncogenes encoding, 680, 682f Ehlers-Danlos syndromes, 240 Eicosanoids, in cell signaling, 621 Eicosapentaenoic acid (EPA), 126 melting point of, 139t Electric charge. See Charge Electric potential, 148 charge and voltage compared with, 165, 165f Electrochemical gradients, 148, 198 Electrolytes, 148 Electrons arrangements in atoms, 33f free radicals and, 35HP asymmetric distribution, 37 atomic arrangement of, 33, 33f excited, 216 photosynthetic unit and, 216 high- vs. low-energy, 183fn ionization and, 34 in photosynthesis high- vs. low-energy, 183fn, 212, 215f from water, 220–221, 225 Electron acceptors, in photosynthesis, 219, 220f, 223, 223f Electron carriers, 191–197, 194f of electron-transport chain, 186, 187f, 192–193, 193f determining their sequence, 192, 193f redox potential, 193f as herbicide targets, 225 in photosynthesis in photosystem I, 223 in photosystem II, 222 in Z scheme, 219 structures, 191f Electron cryomicroscopy (Cryo-EM), 79f, 81f, 310f, 453f, 759 Electron crystallography, 173EP, 174EPf, 760 Electron density maps of acetylcholine receptor, 174EPf of enzyme active site, 98f of hydrogen bond, 101f of diketopiperazine, 759f Electron donors, 187, 187f, 189–190, 193 during photosynthesis, 214–215, 219–223 Electronegative atoms, 34 Electron microscopes. See Scanning electron microscopy; Transmission electron microscopy Electron orbitals, 33f Electron paramagnetic resonance (EPR) spectroscopy, use in membrane protein studies, 136–137, 137f Electron shells, 33, 33f Electron transfer enzyme actions and, 99

into mitochondria, 186 in photosynthetic units, 218, 218f exergonic nature, 223 redox potentials and, 190 in redox reactions, 109–110, 111f, 112–113, 112f reducing power and, 114–115 in TCA cycle, 185f, 190 via cytochrome oxidase, 196, 197f via electron-transport chain, energy from, 187 via NADH, 187, 187f, 190 Electron transfer potential, 189–190 Electron transport, 191–197, 194f in cyclic photophosphorylation, 226 during oxygenic photosynthesis, 219, 219f, 223–225, 224f from Mn-Ca cluster to P680⁺, 222 in photosystem I, 223, 223f in photosystem II, 220f from photosystem II to photosystem I, 222–223, 222f from photosystem II to plastoquinone, 220–221 from water to photosystem II, 221–222 weed killers and, 225 Electron-transport chain, 112, 186–187, 190–197, 194f electron carriers, 186–187, 187f, 192–193, 193f determining their sequence, 192, 193f proton pumps and, 193 Electron-transport complexes in respiration, 193–197, 194f mammalian vs. bacterial, 194f Electron-tunneling pathways, 193, 193f Electrophoresis DNA separation by, 397f, 402f, 760, 761f, 765f RNA separation by, 454f PAGE, 756–757, 756f SDS–PAGE, 146f, 757 two-dimensional, 71f, 757, 757f Electroporation, 776 Electrospray ionization (ESI), 758 Electrostatic attraction, 37 Elimination, of genes knockout mice, 778–780, 779f, 779fn RNA interference, 780, 781f in vitro mutagenesis, 778 Elongation in protein synthesis, 471–474, 472f in transcription, 430f, 431 coordination with polyadenylation and splicing, 453f RNA polymerase in, 443–444 RNA polymerase phosphorylation during, 443f Elongation factors, 443f, 471–472, 471f, 472f Embryonic development axon growth, 381 basal bodies in, 350HP basic body plan, 349HP cadherins in, 254, 254f cell-adhesion molecules in, 253, 254 cell-cell recognition, 250–251, 251f cell differentiation in, 16 cell migration, 242f, 243, 243f changes in cell shape, 381, 382f fibronectin in, 242f interactions between cells and, 250–251 laminins in, 243–244, 243f microRNAs in, 
459–460, 459f of nervous system, 381, 382f neural crest cell migration, 356 organ formation, 242f, 243 role of cilia in, 349HP–350HP situs inversus, 349HP

Embryonic node, 349HP, 349HPfn Embryonic stem (ES) cells, 21HP–22HP, 21HPf, 779 differentiation of, 21HP, 21HPf iPS cells compared with, 22HP–23HP SCNT of, 21HP–22HP, 21HPf transcription factors and, 519 undifferentiated, 21HP Embryos, DNA transfer into, 775–778, 776f, 777f Empty magnification, 734, 734f EMT. See Epithelial-mesenchymal transition Enantiomers. See Stereoisomerism Enbrel, 726HP End-blocking (capping) proteins, 373, 373f in cell locomotion, 378, 378f Endergonic processes, 90 ATP hydrolysis and, 92–93 Endergonic reactions, coupling to exergonic reactions, 92–93 Endocannabinoids, 170 Endocrine signaling, 618, 618f Endocytic pathway, 312-315 dynamic studies, 321EP, 321EPf early studies, 321EP Endocytosis, 272f, 308–315. See also Receptor-mediated endocytosis bulk-phase compared with receptor-mediated, 308 clathrin-mediated, 321EP, 321EPf phosphoinositides in, 311–312 Endomembrane systems, 271, 272f approaches to study, 273–278 conservation of cellular processes, 277–278 conserved nature, 277–278 overview of, 271–273 Endonucleases, restriction, 402f, 764–766, 765f Endoplasmic reticulum (ER), 8f–9f, 11f, 12, 272f, 277f, 279–289 cisternal space (lumen), 279, 279f calcium storage, 280, 649-652 protein processing in, 283 retrieval of “escaped” proteins, 298 mitochondrial fission, 179, 180f origin of, 27EP, 27EPf partitioning during mitosis, 588 processing of new proteins, 283 quality control screening for aberrant proteins, 288 secretory protein dynamics and, 274f stress-reducing measures, 288, 289f Endoplasmic reticulum Golgi intermediate compartment (ERGIC), 289, 296f Endosomes, 272f dynein transport of, 338 in endocytic pathway, 312, 313f in vesicular transport, 296f as signaling platforms, 624, 625f Endosymbionts, 26EP, 29EP Endosymbiont theory, 26EP, 27EPf, 179, 182, 212 Endothelial cells, 235f, 255 Endothermic reactions, 88 End-replication problem, 505–506, 507f Energetics of ATP hydrolysis, 91–92 of Na⁺/glucose cotransporter, 161, 161f of oxygenic 
(O2-releasing) photosynthesis, 218 of solute movement, 148–149 Energy, 87. See also Activation energy; ATP in ATP formation binding change mechanism, 201–202 transduction, 202 capture and utilization, 110–115

cell acquisition and use of, 5, 5f covalent bonds and, 33 from electron-transport chain, 187 for exercise, 188HP during glycolysis, 183 information content and, 89 from ion gradients, use in cotransport, 160–161 in ionic gradients, 189 use in cotransport, 161f law of conservation, 87–88 metabolic regulation, 115–117 in mitochondria, 189 in photons, 216 in proton electrochemical gradient, 202 sources, 5 for motor proteins, 334 in plants, 227–228 storage in carbohydrates, 228 in fats, 186f forms for, 108 of system, 87–88, 88f changes in, 88f in TCA cycle, 185–187 Energy transduction, 87–88, 87f, 122 in photosynthesis, 219 plasma membrane in, 122 role of membranes, 122f Energy transfer, in oxidation, 112f Enhancers, 525–526 Enolase, 111f Enthalpy change (ΔH), 89–90, 90t Entropy, 88–89, 89f hydrophobic interactions and, 37fn Entropy change (ΔS), 88–90, 90f, 90t Enzymes, 6, 50, 94–105 activation energy and, 96–97, 96f active site, 86f, 97–98, 98f, 101f allosteric modulation and, 115–116, 116f covalent modulation, 115 specificity, 98 allosteric modulation of, 115–117, 116f antibiotic resistance and, 106HP–108HP bacterial cell wall and, 106HP catalytic activity, 95t complex with substrate, 86f, 97–98, 97f, 101f conformational changes, 100–102, 101f, 115 covalent modification, 115 effects on reaction rate, 95, 95t, 103f effects on substrates, 98f, 99–102 electrostatic charge, 86f feedback inhibition, 115, 116f in glycolysis, 111f of Golgi complex, 278, 278f, 292–293, 292f retrograde transport, 294f in transport vesicles, 294, 294f, 297f inorganic catalysts compared with, 95 kinetics, 102–105, 105f localization by membranes, 122f of lysosomes, 303–304, 304t disorders of, 306HPt metabolic, oncogenes encoding, 681 in metabolic regulation, 115–117, 116f operation of, 97, 98f, 99–102, 99f direction of reactions, 199, 200f pH and, 103–104, 104f properties, 95–96 redox control, 228–229, 229f relation to genes, 427–428 restriction, 764–766, 765f RTK interaction with, 639f, 640

specificity, 95 temperature and, 103–104, 104f turnover number (catalytic constant), 95t world’s worst, 228 Enzyme inhibitors, 104, 105f, 106HP–108HP Enzyme replacement therapy, 307HP Epidemiologists, 668 Epidermis, 236f Epidermolysis bullosa simplex (EBS), 356 Epigenetics, 509-510, 531-532 in cancer, 670 oncogenes role in, 680–681 in tumor-suppressor genes, 671f transmission during replication, 562-564 Epigenomes, 510 Epinephrine, in cell signaling, 631–634, 633f, 634f Epithelial cells cilia, 345 cytoskeleton components, 325f intestinal, 3–5, 4f plasma membrane, 144, 144f secretory structure, 280, 281f junctions between cells, tight junctions, 260–262, 261f primary cilia, 349HP, 349HPf, 350HP Epithelial-mesenchymal transition (EMT), 254, 254f Epithelial tissue, 236 Epitope, 712 EPR spectroscopy. See Electron paramagnetic resonance spectroscopy Epstein-Barr virus, cancer caused by, 668, 680 Equilibrium in chemical reactions, 90–91 steady-state metabolism compared with, 93–94, 94f Equilibrium centrifugation, of nucleic acids, 547f, 762, 762f Equilibrium constant (Keq), 40, 90–91, 91t Equilibrium potentials, 165–166, 165f ER. See Endoplasmic reticulum ERAD. See ER-associated degradation ER-associated degradation (ERAD) of aberrant proteins, 288 erbB oncogene, 680 Erythroblastosis fetalis, 713 Erythrocytes, 55f, 145–147, 150f Erythrocyte ghosts, 145–147, 146f D-Erythrose, 44f L-Erythrose, 44f ES cells. See Embryonic stem cells Escherichia coli, 13, 18f ESCRT complexes, 312 Ester bonds, 41 Estrogen, 49, 49f cancer and, 669 gene expression and, 511, 511f Etching, freeze, for TEM, 359f, 745–746, 745f, 746f Ethane, 41 Ethanolamine, phosphoglyceride linked to, 126 Eubacteria, 14, 29EP genes, in Archaebacteria, 29EP Euchromatin, 498–499, 498f Eukaryotes cell cycle of, 573f cloning in, 513, 513f DNA replication in, 558–559 chromatin structure and, 562–564, 563f initiation, 559–560, 559f, 560f nuclear structure and, 562, 563f

replication fork activities, 561–562, 561t, 562f restriction to once per cell cycle, 560–561, 560f as domain, 29EP evolutionary relationships, 28EPt gene expression control in, 488–512, 488f chromosomes and chromatin, 493–509 epigenetics, 509–510 nuclear envelope, 488–493, 489f nuclear pore complex, 490–492, 490f, 491f, 493f organization of nucleus, 510–512, 510f, 511f, 512f RNA transport, 492–493 gene regulation in, 512–514, 513f, 514f genome complexity, 400–401, 401f genome sizes, 412f meiosis in, 603, 603f without mitochondria, 27EPfn mRNA structure, 444, 444f phylogenetic tree, 29EPf ribosomes of, 435, 435f single-celled (unicellular), 15–16, 15f transposable genetic elements, 410, 410f Eukaryotic cells, 7–8 advent of, 9f cell division of, 12, 12f cytoplasm, 12 DNA transfer into, 775–778, 776f, 777f flagella of, 12, 13f organelles of, 8f–9f, 10f origin of, 26EP–30EP, 27EPf prokaryotic cells compared with, 8–13, 8f–9f, 10t DNA of, 10 evolutionary relationships, 28EP–30EP RNA polymerases, 433–434 shared properties of, 8–9 structure of, 10–12, 11f transcription factors, 434 size of, 17, 19f structure of, 8f–9f, 10f, 18f types of, 15–17, 15f, 16f Eukaryotic DNA, cloning of in bacterial plasmids, 767–768, 767f, 768f in phage genomes, 768–769, 769f Eukaryotic genes, determining function of knockout mice in, 778–780, 779f, 779fn RNA interference in, 780, 781f in vitro mutagenesis in, 778 Eukaryotic K⁺ channels, 153–156, 154f, 155f, 156f Evolution 2R hypothesis, 406–407 of cancer cells in a tumor, 669-670 of cells, 7 changing roles of RNA, 454 comparative genomics, 413–414 conserved cellular organization and, 5 gene modifications and, 407–411 introns and, 454 invertebrates into vertebrates, 406 molecular, 26EP–30EP, 27EPf mRNA surveillance and, 475 neutral, 413fn of proteins, 76–77, 76f, 77f domains, 411 RNA splicing and, 454 role of extra gene copies, 406 role of mobile genetic elements, 410–411 role of mutations, 390 role of transcription factors, 414 role of 
transposable genetic elements, 411 segmental duplication and, 407fn study via mitochondrial DNA, 182 “test-tube evolution,” 455

Evolutionary relationships, study via microsatellite DNA, 402 Exchangers, in secondary active transport, 161 Excitation-contraction coupling, 370–371 Excitation energy, 218 transfer of, 218, 218f, 220–221, 220f Excited electrons photosynthetic unit and, 218 transfer of in photosystem I, 223f in photosystem II, 220–221, 220f Excited state of molecule, 216 Exercise, and aerobic compared with anaerobic metabolism, 188HP Exergonic processes, 90 Exergonic reactions, coupling to endergonic reactions, 92–93 Exocytosis, 272f, 302–303, 303f with autophagy, 305f Exons, 407–408, 446 acting as introns, 454 ligation (joining), 449–450, 450f, 451f, 452f shuffling, 454 trinucleotide repeats and, 404HPf Exon-exon junctions, 475 Exonic splicing enhancers (ESEs), 450, 452f, 453 Exon-intron junctions, 449–450, 452f Exon-junction complex (EJC), 475 Exonuclease activities, of DNA polymerases, 556–558, 557f, 558f Exothermic reactions, 88 Expansins of cell wall, 268 Exportins, 492 Extavia, 726HP Extracellular environment, interaction with cells, 6, 235–269, 236f, 260f Extracellular materials degradation by metalloproteinases, 244 interactions with cells, 244–250 Extracellular matrix (ECM), 236–244, 237f functions, 236–237 organization, 239f Extracellular messengers, 618–621, 619, 619f. 
See also Hormones receptor down-regulation and, 312 survey of, 621 Extracellular proteins, 238, 239f cell differentiation and, 260, 260f degradation by metalloproteinases, 244 in embryonic development, 242f, 243, 243f Extracellular space, 236–244 Extremely drug resistant (XDR), 106 Extremophiles, 14 F F0 portion of ATP synthase conformational changes in, 204–205, 204f function, 204–205, 204f structure, 199–200, 200f, 205 F1 head (coupling factor 1) of ATP synthase, 199–200, 200f catalytic sites, 200–202, 202f conformational changes, 202, 203f L, T, and O conformations, 202, 203f rotational catalysis, 202–204, 203f proton gradient and, 200f, 201 Facilitated diffusion across membranes, 156–157, 156f, 157f, 161f Facilitative transporter, 148f, 156, 156f, 157f F-actin. See Actin Facultative heterochromatin, 498, 498f

FAD, structure of, 191f in electron transport, 193 in fatty acid cycle, 186f in glycerol phosphate shuttle, 187f in oxidative phosphorylation, 187f in TCA cycle, 186 Familial adenomatous polyposis (FAP), tumor-suppressor genes in, 673t, 678 Familial hypercholesterolemia (FH), and endocytosis, 319EP–320EP Faraday constant, 219f Fast-twitch muscle fibers, 188HP, 188HPf Fats, 47–49, 48f average store of, 49 chemical energy of, 49 components of, 47–48, 48f energy storage, 186f fatty acids in, 48–49, 48f insulin action, 644 synthesis of, electron transfer and, 114–115 Fatty acids differences in, 48–49 in exercise, 188HP in fats, 47, 48f melting points of, 139t in membranes, saturation state, 126, 138–139, 138fn as precursors, 42 properties of, 47–48, 48f structure of, 47, 48f TCA cycle and, 186, 186f Fatty acid cycle, 186f Fatty acyl chains, of membrane lipids, 123, 123f, 126 Feedback inhibition, of metabolic pathways, 115, 116f Fermentation, 95, 113–114, 114f, 515-517 Ferredoxin, 223, 223f in redox control, 228–229, 229f Ferredoxin-NADP+ reductase (FNR), 223, 224f Fertilization calcium waves during, 650, 650f chromosome number and, 388f, 389 cyclins and, 613EP, 613EPf maturation-promoting factor activity and, 612EP stage of life cycle, 603f, 604f Feulgen stain, 735, 735f FG repeats, in nucleoporins, 492 FH. See Familial hypercholesterolemia Fibrillar collagens, 239–240, 239f Fibrinogen, and integrins, 245–246, 245t, 247f Fibroblasts, 181f, 236, 236f, 751f locomotion, 326f, 375f, 376f, 377, 377f, 378f traction forces, 249, 249f, 379f mitochondria and microtubules, 178f, 179f Fibronectin, 239f, 241–243 alternative splicing of, 534, 534f binding sites, 242–243, 242f binding to integrins, 242f, 245t cell adhesion and, 235f cell migration and, 243, 751 in embryonic development, 242f, 243, 243f production of, 534 structure, 242f Fibrosis, 240 Fibrous proteins, 58 Fidelity, in DNA replication, 557–558, 557f, 558f Filaments.
See Actin filaments; Contractile filaments; Microfilaments Filament-severing proteins, 373–374, 373f Filamin, 373 Fimbrin, 373 Finasteride, 669 “Fingerprinting,” of DNA, 402, 402f

“Fingerprints” peptide mass, of proteins, 71f, 72 FISH. See Fluorescence in situ hybridization Fish oil, 126 Fixatives for bright-field microscopy, 735 for electron microscopy, 742 Flagella, 345–353, 346f, 347f, 746f. See also Axonemes bacterial, 8f–9f, 345 basal bodies and, 340, 346–347, 348f beating patterns (waveforms), 345, 346f, 350f, 353f cilia compared with, 345 eukaryotic, 347f prokaryotic compared with, 12, 13f intraflagellar transport, 350, 350f locomotion mechanism, 351–352, 352f, 353f role of dynein, 350–351, 351f sliding-microtubule theory, 353f prokaryotic, 345 of sperm, 13f, 431f Flagellar (ciliary, axonemal) dynein, 350–351, 351f Flavoproteins, 191 Flippases, 285 Fluidity, viscosity compared with, 138fn Fluid-mosaic model of membrane structure, 124–125, 124f Fluorescence, 736 dual-label, 321f, 732f Fluorescence in situ hybridization (FISH), 402–403, 403f, 510, 510f Fluorescence microscopy, 736–738, 737f, 738f in cytoskeleton studies, 326–330, 326f, 327f in endomembrane studies, 273-275 laser scanning confocal, 739–740, 739f super-resolution, 740, 740f Fluorescence recovery after photobleaching (FRAP) technique, 142, 142f of cytoskeleton, 329–330, 330f Fluorescence resonance energy transfer (FRET), 137, 627f, 738, 738f Fluorescence speckle microscopy, 327, 344f Fluorophores, 736 Flurizan, 69HP–70HP FMN, oxidized and reduced forms, 191f Fn modules, 241–242, 242f, 252f Focal adhesions, 248, 248f, 259f signal transmission and, 248–249, 248f, 666 traction forces and, 248–249, 248f, 376, 378-379, 379f Folding of introns, 451f of proteins, 63-70, 80EP-83EP, 287-288 of RNA, 429 Force generation by actin polymerization, 375f, 377–378 during cytokinesis, 599-600 during mitosis, 594-596, 597 in ciliary and flagellar motion, 351–352, 352f by ciliary/flagellar dynein, 351f in mitosis, 338 in motility, 249, 249f, 326, 337, 337f, 351, 352f, 374, 375f, 377–378, 379f fuel for, 334 by motor proteins, 334 in neural tube formation, 382f Formin, in actin filament 
nucleation, 372, 599 Formylmethionine, 468–469, 468fn Forward genetics, 778 FOXP3 gene mutations, 710

Fractionation of cells, 275-276, 276f, 752, 752f of nucleic acids, 760 by gel electrophoresis, 760, 761f by ultracentrifugation, 760–762, 762f of proteins, 71, 71f, 752 liquid column chromatography, 753–756, 753f, 754f, 755f PAGE, 756–757, 756f, 757f protein measurement and analysis techniques, 757–758, 758f, 758fn selective precipitation, 752–753 Fragile X syndrome, 404HPf, 405HP Frameshift mutations, 473–474 in genetic code, 464f FRAP technique. See Fluorescence recovery after photobleaching technique Free energy, 89–94 of proton-motive force, 198 release in electron transfer, 192f, 193 Free energy change (ΔG), 89–94, 90t in biochemical reactions, 90–92 in glycolysis, 110, 111f in metabolic reactions, 110–111, 111f reactant/product ratio and, 91–92 standard change, 91–92, 91fn, 91t, 110 when solutes cross membranes, 148–149 Free radicals, 34, 35HP quinones, in photosynthesis, 220f symbol for, 35HP Freeze etching, for TEM, 359f, 745–746, 745f, 746f Freeze-fracture replication technique, for TEM, 745–746, 745f, 746f Freeze-fracture replication of membranes, 132, 132f Friedreich’s ataxia, 404HPf Frozen specimens, for TEM, 743–744 D-Fructose, 43f Fructose 1,6-bisphosphatase, 116f, 117 Fructose 1,6-bisphosphate, 111f, 112 Fructose 6-phosphate, 111f, 112, 116f, 117 Functional groups, 41, 41t Fungi, origin of, 27EPf Fused cells, 141, 141f Fusion pores, 303, 303f G G-actin (actin-ATP monomers), 372 Gain-of-function mutations, 404HP in GPCRs, 626HP oncogenes arising from, 671f, 672 Galactocerebroside, 127 Gametes chromosome number, 388f, 389 formation of, 603, 603f genetic properties, 387–388 piRNAs in, 460–461 Gametic meiosis, 603, 603f Gametophyte, 603, 603f Gamma radiation, 749 Gangliosides, 126, 126f Ganglioside disorders, 307HP Gap junctions, 262–264, 263f Gap-junction intercellular communication (GJIC), 263, 264f Gap-junction plaques, 263, 263f GAPs.
See GTPase-activating proteins Gas constant (R), 91 Gases, in cell signaling, 621 Gastric cancer diet role in, 668, 668f genes associated with, 673t, 675f

Gastrulation basic body plan and, 349HP cell migration in, 381, 382f Gated ion channels, 152 cotransporters and, 161 protein pumps compared with, 161 in synaptic transmission, 169, 169f Gaucher’s disease, 306HPt, 307HP GEFs. See Guanine nucleotide-exchange factors Gel electrophoresis. See Electrophoresis Gel filtration chromatography, 754, 754f Gelsolin, 374 Genes activation of, by hormone, 525, 525f of B- and T-cell antigen receptors, 713–716, 714f, 715f changing concept of, 426–428 chemical nature, 393–398, 420EP–423EP CJD and, 66HP disease-associated, identifying them, 417HP–420HP divergence, 408, 408f duplications, 407–408, 407f, 408f, 416f types of, 407fn elimination/silencing of knockout mice, 778–780, 779f, 779fn RNA interference, 780, 781f in vitro mutagenesis, 778 historic discoveries, 387f of human genome, 70 “jumping,” 408–410, 409f, 410f lateral transfer, 29EP linkage groups, 389 locus on chromosome, 391 Mendel’s concepts, 387–388 physical basis, 389 mobile, 408–411 multiple peptides from, 70fn, 428, 454 noncoding sequences, 408, 413 genome size and, 412fn of operon, 484 penetrance, 417HP–418HP, 417HPfn polymorphic, 416, 417HP–418HP, 417HPfn polypeptides and, 70 protein-coding, 403–405 DNA amount compared with, 412f, 412fn extra copies, 403–405, 417 number in human genome, 412–413 numbers in different genomes, 412f pseudogenes, 408, 408f related families, 403–405, 407, 408f relation to nucleotide sequences, 396 relation to proteins, 427–429 for rRNA, 436, 436f split, 444–448, 454 transfer into eukaryotic cells and mammalian embryos, 775–778, 776f, 777f for tRNA, 440f Genealogy, study via mitochondrial DNA, 182 Gene expression, 426–481 control of, 483–542 in bacteria, 484–488 in eukaryotes, 488–512, 488f by operons, 484–487, 486f riboswitches, 487–488 between distantly located genes, 511, 511f DNA methylation and, 530f, 531–532, 531f elimination of (inactivation), by small interfering RNAs, 457 genomic imprinting, 531f nuclear organization
and, 511, 511f, 512f

posttranslational control, 541–542, 541f regulation of in eukaryotes, 512–514, 513f, 514f by microRNAs, 459–460 RNA processing control of, 533–535, 534f, 535f selective, 513 species differences, 71f differences in control, 413 transcriptional control of, 514–533, 514f, 516f, 517f DNA sites in, 522–525, 522f, 523f transcription factors for, 517–522, 518f, 520f translational control, 536–540, 536f microRNA in, 539–540, 540f mRNA stability and, 538–539, 539f Gene-expression analysis, 514-517, 516f, 517f of cancer, 685-686, 686f, 687f General transcription factors (GTFs), 522-523, 530 during basal transcription, 441–443, 442f, 443f Gene rearrangements of antibody genes, 411, 713-716 regulation by microRNAs, 460 Gene therapy, 209HP DNA delivery systems for, 163HP RNA interference, 458HP–459HP use of viruses, 26, 163HP, 689 Genetic anticipation, 404HP Genetic code, 461–464, 464f codon meanings, 463–464 decoder chart, 464f decoding by tRNAs, 464–468 degeneracy, 463 mutations in, 463, 464f nuclear compared with mitochondrial, 464 overlapping compared with nonoverlapping, 462, 463f properties, 461–464 universality, 464 Genetic engineering, 689, 776–778, 777f of plants, for C4 properties, 232 Genetic information cell division and, 5 evolutionary diversity and, 407 evolutionary relationships and, 29EP flow through cell, 428–429, 428f genome and, 398–399, 405–406 in polyploidization, 406 as property of cell, 5 storage and use of, 428 translation, 468–477 Genetic material DNA as, discovery of, 420EP–423EP functions, 396, 397f of viruses, 23, 24f Genetic polymorphisms, 416–417, 416f disease risk and, 415, 417HP–418HP, 417HPfn, 716-717 in GPCRs, 626HP of ABO blood groups, 130, 130f Genetic recombination, 390–391 during DNA repair, 568 “hot spots,” 419HP, 611 during meiosis, 610–611, 610f Genetics of cancer, 669–671, 670f, 674f cancer genome, 683–685, 683f, 684f, 685t gene-expression analysis, 685–687, 686f, 687f microRNAs, 681–683 mutator phenotype, 681 oncogenes, 671–672, 671f, 
672f, 679–681, 679f, 682f

tumor-suppressor genes, 671–679, 671f, 672f, 673t, 674f, 675f, 676f, 677f, 679f historic discoveries, 387f Mendel’s concepts, 387–388 physical basis, 389 Genetic variability, in human populations, 415-420 Genomes, 386 cancer, 683–685, 683f, 684f, 685t of chloroplasts, 212, 214 comparative genomics, 413–414 complexity, 399–406 in bacteria and viruses, 400, 401f in eukaryotes, 400–401, 401f dynamic nature, 408–411 evidence of “foreign” genes, 29EP genetic information and, 398–399, 405–406 instability, 406–411 noncoding compared with protein-coding portions, 413 organization of, prokaryotes, 484 packaging of, 494–498 heterochromatin and euchromatin, 498f histone code and heterochromatin formation, 500f, 501f, 502f mitotic chromosome structure, 503f nucleosomes, 494f, 494t, 495f polyploidization, 406–407 segmental duplication, 407fn sequencing of, 411–417, 771–773, 772f size comparisons, 412f, 412fn stability, 406–411 structure, 398–406 Genome-wide association studies (GWASs), for disease-linked polymorphisms, 418HP Genome-wide location analysis, by ChIP, 523–524, 524f Genomics, comparative, 413–414 Genomic analysis, application to medicine, 417HP–420HP Genomic imprinting, 531f, 532 Genomic libraries, 773–774, 775f Germ cells embryonic migration, 243, 243f methylation state of, 531f mitotic compared with meiotic division, 389 piRNAs in, 460–461 telomere length in, 508f Germinal vesicles, 611EP GFP. 
See Green fluorescent protein GFP fusion protein, 274 in protein transport studies, 274, 275f GFP-tubulin, FRAP study with, 329–330, 330f GGA adaptor proteins, 299–300, 300f Ghosts, of erythrocyte plasma membrane, 145–147, 146f Giant chromosomes, 392, 392f research uses, 392 Gilenya, 726HP Glandular cells, polarity of organelles, 280, 281f Gleevec, 75–76, 75f, 690 Glioblastoma, immunotherapy for, 689 Global genomic pathway, of DNA repair, 565 Global warming, and atmospheric CO2, 230–231 Globin genes evolution of, 407–408, 408f human compared with mouse, 413 introns in, 447f Globular proteins, 58 Glomerular basement membrane (GBM), 237–238, 238f Glu. See Glutamic acid

Glucagon, in cell signaling, 631–633, 633f, 644 Glucocorticoid, secretion of, 524–525 Glucocorticoid receptor (GR), in transcription, 524–525, 525f Glucocorticoid response element (GRE), 525 Gluconeogenesis, glycolysis compared with, 116–117, 116f, 644 Glucose. See also Blood glucose ATP from, 111f, 187 in cellulose, 46f, 47 cotransport with sodium ion, 160–161, 161f in energy metabolism, 110 as energy source, 5 facilitated diffusion across membrane, 156f, 157, 157f in glycogen, 45, 46f in starch, 46–47, 46f D-Glucose, 43f Glucose 6-phosphate, 111f, 112, 116f Glucose transporters (GLUTs), 157, 645f, 646 Glutamate, as brain neurotransmitter, 170 synaptic strengthening and, 170–171 Glutamic acid (Glu, E), 51–53, 52f, 53f Glutamine (Gln, Q), 52f, 53, 404HP-405HP Glutathione peroxidase, hydrogen peroxide and, 35HP Glycans. See Carbohydrates D-Glyceraldehyde, 44, 44f L-Glyceraldehyde, 44, 44f Glyceraldehyde, stereoisomerism of, 44, 44f Glyceraldehyde 3-phosphate (GAP) in carbohydrate synthesis, 227f, 228, 228f in glycolysis, 111f, 112 oxidation, 112–113, 112f Glyceraldehyde 3-phosphate dehydrogenase, in glycolysis, 112 Glycerol, in fats, 47, 48f Glycerol backbone, of phosphoglycerides, 125–126, 125f, 126f Glycerol phosphate shuttle, 186, 187f Glycine (Gly, G), 52f, 53–54, 154, 154f Glycocalyx (GC), 236, 237f Glycogen, 45–47, 46f, 631-632 average store of, 49 diabetes and, 45, 644-646, 645f Glycolate, in photorespiration, 229, 230f Glycolipids, 126–127, 127t in membranes, 126f, 127 in myelin sheath, 127 from oligosaccharides, 45 in plasma membrane, 124f sites of synthesis, 285 Glycolysis, 110–113, 111f, 183–185.
See also ATP formation anaerobic pathway, 113 in cancer cells, 667 free-energy change in, 92, 111f gluconeogenesis compared with, 116–117, 116f for muscle ATP needs, 188HP net equation, 113, 184f overview, 184f pyruvate dehydrogenase and, 61 Glycophorin A, 134, 134f, 135f in erythrocyte plasma membrane, 145, 146f, 147 Glycophosphatidylinositol (GPI)-anchored membrane proteins, 131f, 137 lipid rafts and, 139, 140f Glycophosphatidylinositol (GPI) linkage, in peripheral membrane proteins, 137 Glycoproteins assembly, in rough endoplasmic reticulum, 285–288, 287f

formation of, 51 misfolded, 287–288 destruction of, 288 modification, in Golgi complex, 292 from oligosaccharides, 45 in plasma membrane, 124f, 129, 130f screening for defects, 287–288, 288f Glycosaminoglycans (GAGs), 46f, 47, 47f in proteoglycans, 240, 241f Glycosidic bonds, 44, 45f, 46f, 129, 130f in glycogen, 44, 45f, 46f Glycosylation, 129 in Golgi complex, 292, 293f mutations in, 286–287 in rough endoplasmic reticulum, 285–288 Glycosyltransferases, 286 Glyoxysomes, 206, 207f Golgi complex, 8f–9f, 10, 11f, 12, 272f, 290–295, 291f cytoplasmic dynein and, 337f enzymes in, 278, 278f, 292–293, 292f, 293f transport in vesicles, 294, 294f glycosylation in, 292, 293f membrane skeleton of, 290 microtubules and, 337f, 601 morphology, 290, 291f origin of, 27EP partitioning during mitosis, 588 polarity, 291f secretory protein dynamics and, 274f, 281f study via cell fractionation, 275 transport through, 292–295, 294f, 297f cisternal maturation model, 293–294, 294f vesicular transport model, 293–294, 294f Golgi stack, 290, 290fn, 291f “Good” cholesterol, 314 GPCRs. See G protein-coupled receptors GPI. See Glycophosphatidylinositol G proteins. 
See GTP-binding proteins G protein-coupled receptor kinase (GRK), 624 G protein-coupled receptors (GPCRs), 621-636, 653f, 654f in blood glucose regulation, 631–632, 631f cAMP in glucose mobilization, 632, 632f, 633f other aspects of cAMP signal transduction, 632–634, 633f, 633t, 634f, 635f signal amplification in, 632, 633f disorders associated with, 625HP–626HP, 625HPf, 626HPt second messengers of, 621–622, 622f, 622t cAMP, 627, 627f phosphatidylinositol-derived, 627–629, 628f, 629f phospholipase C, 629–630, 629f, 630f, 631t in sensory perception, 634–636 signal transduction by, 622–625, 623f, 625f bacterial toxins affecting, 627 G proteins, 623f, 624 receptor structure, 622–624, 623f termination of response, 623f, 624–625, 625f specificity of responses of, 630–631 Grana, 214 Grana thylakoids, 214f Granzymes, 709 Graves’ disease, autoimmunity and, 725HP Grb2, 639–640, 639f Green fluorescent protein (GFP), 737–738, 737f in cytoskeleton studies, 326–327, 326f photoactivatable, 740, 740f in protein transport studies, 273–275, 275f GRK. See G protein-coupled receptor kinase

GroEL molecular chaperone, 81EP–83EP, 81EPf, 82EPf, 83EPf conformational changes, 81EP–82EP, 82EPf, 83EPf polypeptide binding site, 82EP–83EP GroES molecular chaperone, 81EP–83EP, 82EPf, 83EPf Ground state of molecule, 216 Growth, of cancer cells, 665–667, 666f, 667f Growth cones of nerve cells, 325f, 379–381 directed movement, 380f, 381, 381f structure, 325f, 380f type II myosin in, 361f Growth factors cancer cells and, 666, 666f oncogenes encoding, 679f, 680 Growth factor receptors, 636-646 oncogenes encoding, 679f, 680, 682f, 688-689 Growth hormone (GH), Somavert from, 75 GTFs. See General transcription factors GTP (guanosine triphosphate), 79 in microtubule assembly, 342–344, 342fn GTPase-activating proteins (GAPs), 641, 673t GTP-binding proteins (G proteins), 283fn, 492, 621–622, 622f, 622t in cancer, 679, 682f, 684, 684f of coated vesicles, 296–297 in cytokinesis, 599 GTP- compared with GDP-bound, 283 in nucleocytoplasmic transport, 492 in secretory protein synthesis, 283 in signal transduction, 623f, 624 in signal transmission, 260, 624, 626HP, 627, 640-644 in translation, 468-474, 536 structure and cycle of, 640–641, 641f in tethering vesicles to targets, 301–302, 301f GTP hydrolysis microtubule assembly and, 342–344, 343f in secretory protein synthesis, 282f, 283 Guanine (G), 78, 393f, 394 base pairing, 395f–396f structure, 78f Guanine nucleotide-dissociation inhibitors (GDIs), 641 Guanine nucleotide-exchange factors (GEFs), 641 Guanylyl cyclase, NO activation of, 656 H Hair cells of ear, 364, 365f stereocilia, 364, 365f, 373 Half-life of radioactive atoms (t1/2), 749, 749t Half-life of mRNAs and proteins, 538, 542 Halobacterium salinarum, 160 Halophiles, 14 Hammerhead ribozyme, 78–79, 78f Haploid cells, use in research, 277–278 Haploid state, 18f, 399 Haplotypes, 419HP disease-associated, identifying them, 419HP SNPs, 419HP, 419HPf HapMaps, 419HP HA protein. See Hemagglutinin protein HATs. See Histone acetyltransferases HDACs.
See Histone deacetylases HDLs. See High-density lipoproteins Head groups of membrane lipids, 123, 123f, 126, 126f Hearing, and gated ion channels, 152 Heart attacks cell replacement therapy for, 20HP integrin-targeting drugs for, 246–247

Heart disease cholesterol and, 314, 314f iPS cells for, 22HP Heart muscle ATP production, 188HP gap junctions and, 263–264 Heart muscle cells, role of smooth endoplasmic reticulum in, 280, 649-650 Heat shock genes, 433 Heat shock proteins, 80EP Heat shock response, 80EP Heavy chains, of immunoglobulins, 80, 710–713, 710t, 711f, 711fn, 713f genes encoding, 713–716, 714f, 715f HeLa cells, 3, 3f Helicases in pre-mRNA splicing, 451 in replication, 545f, 554f, 555f, 560-561, 561t, 562f in transcription, 442–443 Helicobacter pylori, cancer caused by, 668 Helix, helices. See also Alpha helix in actin filaments, 358f of DNA, 386f, 394, 395f–396f, 396 actions during transcription, 430f, 431, 432f of double-stranded RNA, 78f, 429, 429f, 465, 466f, 470 transmembrane (membrane-associated), 120f, 131f, 133f, 134, 134f, 134fn, 135f, 146f in ion channels, 153–154, 153f, 154f, 155f, 156f, 174EPf in membrane fusion, 303f Helix-loop-helix (HLH) motif, 521, 521f Helper T cells (TH cells), 709–710, 709f, 709fn during AIDS, 710 APC activation of, 722–723, 722f APC interactions with, 718f, 720–721 B cell activation by, 722f, 723 development of, 721 Hemagglutinin (HA) protein, protein engineering for, 74 Hematopoietic stem cells (HSCs), 20HP as autoimmune disease therapy, 726HP differentiation of, 703–704, 703f telomerase in, 508 Heme groups in cytochromes, 191, 191f, 196–197, 197f in myoglobin, 58, 103f in succinate dehydrogenase, 191 Hemicelluloses, 266f, 267f, 268 Hemidesmosomes, 249–250, 250f, 259f Hemoglobin conformation of, 55, 56f embryonic vs. fetal vs.
adult forms, evolution of, 407–408, 408f quaternary structure of, 60, 61f Heparin, 47 Hepatitis B virus, cancer caused by, 668 Heptoses, 43 HER2, cancer therapy targeting, 688 Herbicides, and electron transport, 225 Herceptin, 688 Heredity, Mendel’s conclusions, 387–388 physical basis, 389 Herpes viruses, cancer caused by, 668 Heterochromatin, 498–499, 498f constitutive, 498 facultative, 498, 498f formation of, 499–501, 500f, 501f, 502f, 532–533 X chromosome inactivation, 498f, 499 Heterochromatin protein 1 (HP1), 501 Heterodimer, 60, 61f, 330–331, 331f, 519–522, 521f

Heteroduplex, 610, 610f Heterogeneous nuclear ribonucleoproteins (hnRNPs), 450 Heterogeneous nuclear RNAs (hnRNAs), 440f, 441, 444–448, 445f splicing and, 450 Heteroplasmy, 208HP Heterotrimeric G proteins. See GTP-binding proteins Heterotrophs, 26EP–27EP, 211 Hexoses, 43 HIF, of cancer cells, 667 High-density lipoproteins (HDLs), 314–315 High-energy electrons, 183fn in ATP formation, 186–187, 187f in photosynthesis, 215f Highly repeated DNA sequences, 401–403 sequencing of human genome and, 412 High performance liquid chromatography (HPLC), 753 High-speed AFM (HS-AFM), 748, 748f Histidine (His, H), 51–53, 52f, 53f Histology, of cancer cells, 607f, 670 Histones, 494 alternative versions (variants) of, 495–496, 509 chromatin and, 526-527, 527f, 530 classes of, 494–495, 494t, 495f DNA replication and, 509–510 linker, 495 Histone acetyltransferases (HATs), 527, 528f Histone code, 499–501, 500f, 501f, 502f Histone deacetylases (HDACs), 530 Histone demethylase, 533, 533f Histone methyltransferase, 501, 502f, 533 HIV. See Human immunodeficiency virus H+/K+-ATPase, 159, 159f HLA-B*35 allele, 717 HLA-DRB1*1302, 717 HLH motif. See Helix-loop-helix motif HMG CoA reductase, 319EP, 319EPf anti-cholesterol medications, 314 Hodgkin’s lymphoma, immunotherapy for, 688 Holliday junctions, 610–611, 610f Homodimer, 60, 61f Homogenization of cells, for fractionation, 275, 752 Homogenizers, 752 Homologous chromosomes (homologues), 389, 389f, 390f association and segregation in meiosis, 388, 602–608 separation in meiosis, failure of, 608HP Homologous proteins, 76–77 in electrophoretic gels, 71, 71f in yeast and humans, 18f, 560-561, 576 Homologous recombination (HR), 568, 610-611, 779 Homology modeling, 133 Homology search, 610 Hormones, 50, 618 information transport across membrane and, 122f steroids, in cell signaling, 621 Hosts range of viruses, 24–25 of viruses, 24 “Hot spots,” in genetic recombination, 419HP, 611 Housekeeping receptors, 312, 313f HPV.
See Human papilloma virus Hsc70, 311 Hsp60 chaperones, 81EP in chloroplast protein import, 318f Hsp70 chaperones, 65, 81EP in chloroplast protein import, 318f in mitochondrial protein import, 316, 317f

Hsp90 chaperones, in mitochondrial protein import, 316 Hub proteins, 62–63, 63f Humans evolution of, 414–416 genetic similarity, 416 genetic variations, 416 origins of genetic variation and, 402 haplotypes and, 419HP–420HP Human cell lines, 751 Human genome Alu repeated sequences, 410–411 chimpanzee genome compared with, 414–415 conservation in, 413–414, 413fn copy number variations, 417 “dark matter,” 412 DNA of, 513 DNA sequence variations, 416 genes of, 70 mouse genome compared with, 413 number of base pairs, 412, 412f number of genes, 412 number of protein-coding genes, 412 number of base pairs compared with, 412f number of proteins compared with, 412–413 rDNA clusters of, 435 sequencing of, 412, 771–773, 772f size, base pairs compared with gene number, 412f SNPs, 416 transcription selections of, 461 variations in, 416–417, 416f Human immunodeficiency virus (HIV) cancer caused by, 668 dynein transport of, 338 evolutionary origins, 410 helper T cells and, 710 infection by, 25, 25f replication of, 410 structure of, 24, 24f transcription of, 108HP Human longevity, cell signaling role in, 647HP–648HP, 647HPf Human microbiome, 15 Human migration, 182, 415, 419HP Human monoclonal antibodies, cancer therapy using, 688, 693 Human papilloma virus (HPV), cancer caused by, 668 Humira, 726HP, 783 Humoral immunity, 703–704 Huntington’s disease CAG trinucleotide repeats in, 404HP, 404HPf genetics of, 404HP molecular basis, 404HP–405HP Hutchinson-Gilford progeria syndrome (HGPS), 490 Hyaluronic acid, 240 Hybridization fluorescence in situ hybridization, 402-403, 403f nucleic acid, 762–764, 763f in situ, in DNA cloning, 767–768, 768f Hybridomas, 782–783 Hydrocarbons, 41 Hydrochloric acid, 39, 39t Hydrodynamic radius, protein separation based on, 754 Hydrogenation, 49 Hydrogen bonds, 36, 36f, 37f in alpha helix, 55 in beta-pleated sheet, 56

in DNA, 36, 36f dynamic changes in proteins of, 59–60 in myoglobin, 59, 59f between water molecules, 37–38, 38f Hydrogen ions, 39fn donation of, 39 exclusion from aquaporin channels, 151, 151f pH and, 39 in symport and antiport, 161 transport across membranes, 122f by V-type pumps, 159–160 Hydrogen peroxide, 35HP in peroxisomes, 206 Hydrogen sulfide (H2S) for photosynthesis, compared with water, 212 structure of, 37fn Hydronium ion (H3O+), 39 Hydropathy plot, 134–135, 135f Hydrophilic amino acid residues, 54 Hydrophilic molecules, 36 Hydrophilic regions of integral membrane proteins, 130–132 of membrane lipids, 123f, 126, 126f, 127f Hydrophobic amino acids, 54 in integral membrane proteins, 134–135, 135f, 174EP, 174EPf in plasma membrane, 134–135 Hydrophobic interactions, 36–37, 37f, 38f of nonpolar amino acids, 54 in plasma membrane, 123, 123f Hydrophobic molecules, 36 fat as, 49 Hydrophobic regions of integral membrane proteins, 124f, 130–133, 135f transport into endoplasmic reticulum membrane and, 284, 284f of membrane lipids, 123, 123f, 126, 126f Hydroxyl ion (OH−), 39 Hydroxyl radical, 35HP Hyperthermophiles, 14 Hypertonic (hyperosmotic) solution, 149, 150f Hypoparathyroidism, GPCRs mutation in, 626HP Hypotonic (hypoosmotic) solution, 149, 150f H zone of sarcomere, 366, 366f I I bands of sarcomere, 366, 366f ICAMs, 255HP, 255HPf I-cell disease, 306HP Icosahedron capsid, 24, 24f IgA molecules, 710t, 711 IgD molecules, 710t, 711 IgE molecules, 710–712, 710t IGF-1.
See Insulin-like growth factor IgG molecules, 710–713, 710f, 710t, 711f, 711fn IgM molecules, 710–712, 710f, 710t Image processing, 738–739 Imaging techniques, live-cell imaging, 326 Immune response, 699–731 in autoimmune diseases, 706, 724HP–726HP regulatory T cell role in, 710 T-cell selection role in, 721 therapy for, 725HP–726HP cellular and molecular basis of immunity, 710 antibody molecular structure, 710–713, 710t, 711f, 711fn, 712f, 713f distinguishing self from nonself, 721–722, 721f DNA rearrangements producing genes encoding B- and T-cell antigen receptors, 713–716, 714f, 715f

lymphocyte activation by cell-surface signals, 722–723, 722f major histocompatibility complex, 716–721, 717f, 718f, 719f, 720f, 721f, 725HP, 727EP–730EP, 727EPt, 728EPt, 729EPf, 730EPf membrane-bound antigen receptor complexes, 716, 716f signal transduction pathways in lymphocyte activation, 723–724 clonal selection theory applied to B cells, 704–706, 704f, 705f, 706f vaccination, 706–707 overview of, 700, 701f adaptive immune responses, 701f, 703–704, 703f innate immune responses, 700–703, 701f, 702f T cell activation and mechanism of action, 707–710, 707f, 708f, 709f, 709fn Immune system, 699–700, 700f cells of, 699–700, 699f tight junctions of blood-brain barrier and, 262 free radicals in, 35HP “genetic,” 457 lymphoid organs of, 699, 700f Immunity, 699 cellular and molecular basis of, 710 antibody molecular structure, 710–713, 710t, 711f, 711fn, 712f, 713f distinguishing self from nonself, 721–722, 721f DNA rearrangements producing genes encoding B- and T-cell antigen receptors, 713–716, 714f, 715f lymphocyte activation by cell-surface signals, 722–723, 722f major histocompatibility complex, 716–721, 717f, 718f, 719f, 720f, 721f, 725HP, 727EP–730EP, 727EPt, 728EPt, 729EPf, 730EPf membrane-bound antigen receptor complexes, 716, 716f signal transduction pathways in lymphocyte activation, 723–724 Immunization for Alzheimer’s disease, 69HP passive, 69HP Immunofluorescence, 737, 783 Immunoglobulin (Ig). 
See also Antibodies as cell-adhesion molecules, 252–253, 252f domains of, 252, 252f genes encoding, 713–716, 714f, 715f molecular biology techniques using, 780–783, 782f molecular structure of, 710–713, 710t, 711f, 711fn, 712f, 713f sites of synthesis, 280 Immunoglobulin superfamily (IgSF), 252–253, 703 as cell-adhesion molecules, 252f, 259f in inflammation, 255HP Immunologic memory, 705–706 Immunologic rejection, 716-717 Immunologic synapse, 717, 718f Immunologic tolerance, 706, 724HP Immunosuppressive drugs, 725HP Immunotherapy, for cancer, 688–689 Importins, 492 Inborn errors of metabolism, 427 Incomplete linkage, 390–391 Independent assortment, law of, 388, 389 Indirect immunofluorescence, 783 Indomethacin, cancer prevention with, 669 Induced fit, of enzyme and substrate, 100, 101f Induced pluripotent stem cells (iPS cells), 22HP–23HP, 22HPf

development of, 22HP ES cells compared with, 22HP–23HP generation of, 22HP for inherited disorders, 22HP, 22HPf issues with, 22HP teratomas with, 22HP–23HP transcription factors and, 519 Inducer, 485 Inducible operon, 485 Infections antibiotic-resistant, 106HP–108HP bacterial, bacteriophages as therapy, 26 with prions, 66HP susceptibility to, MHC protein role in, 716–717, 717f types of, 700 viral, 25–26, 25f Inflammation in cancer, 668–669 in immune response, 702 role of cell adhesion in, 255HP–257HP Inflammatory bowel diseases (IBDs) autoimmunity and, 725HP RNA interference therapy for, 459HP Influenza 1918 pandemic, 25 RNA interference therapy, 458HP Influenza virus host range of, 25 jumping from birds to humans, 25 Inheritance of cancer, 664, 669, 673, 675, 678 of complex diseases, 417HP-419HP epigenetic compared with genetic, 509-510 mitochondrial compared with Mendelian, 208HP Initiation, of DNA replication, 559–561, 559f, 560f Initiation codons, 468, 468fn, 469f, 470 reading frame and, 473–474 Initiation complex for translation, 469, 469f Initiation factors for translation (IFs, eIFs), 468, 469f Initiation site (start point) in eukaryotic transcription, 442f, 443f, 522-523 in prokaryotic transcription, 432f, 433, 433f Initiator (Inr), in eukaryotic promoters, 442f INK4a gene, in cancer development, 678 Innate immune responses, 700–703, 701f, 702f Inner mitochondrial membrane (IMM), 180, 181f, 187f, 194f, 199f, 205f, 316, 317f Inorganic molecules, 40 Inositol 1,4,5-trisphosphate (IP3), 629f, 630, 631t “Inside-out” signaling by integrins, 245–246 In situ hybridization, 402 in DNA cloning, 767–768, 768f Insulators, 525 Insulin in blood glucose regulation, 631, 644, 644f in cancer, 668–669 glucose transporters and, 157 sequencing of, 55 Insulin-like growth factor (IGF-1) in cancer, 668–669 in human longevity, 647HP–648HP Insulin receptor, signaling by, 644 diabetes mellitus and, 646 glucose transport, 646, 646f human longevity and, 647HP–648HP, 647HPf insulin
receptors as protein-tyrosine kinases, 644, 644f insulin receptor substrates 1 and 2 in, 644–646, 645f Insulin-receptor substrates (IRSs), 644–646, 645f

Insulin resistance, 646 Integral membrane proteins, 124f, 130–137, 131f. See also Cell surface receptors in cell-cell adhesion, 251 conformational changes in, 135–137, 136f, 137f crystallization of, 133–134, 133f cytoplasmic tails, 246f diffusion through, 148f functions, 130–132 hydrophobic regions, 124f transport into endoplasmic reticulum membrane and, 284, 284f integration into membranes, 283–284, 284f internal spatial relationships, 136–137, 137f mobility, 141–143, 141f, 143f orientation in membranes, 284, 284f, 285f sites and functions, 144f structure and properties of, 132–137 synthesis on ribosomes, 283–284, 284f synthesis site, 280 targeting to destination, 272, 300 transmembrane domains, 134–135, 134f, 135f, 146f in transmembrane signaling, 259–260 Integrins, 244–247, 246f adhesion and, 259f anti-integrin antibodies, 246–247, 247f, 726 cell survival and, 246, 666 conformations, 245, 245f, 246f, 248f focal adhesions and, 248, 248f, 376–378 functions, 246 hemidesmosomes and, 250, 250f in inflammation, 255HP–256HP, 255HPf ligand binding, 245–246, 245t, 246f, 248f, 249, 260 protein kinase activation, 246, 248f, 249 signal transmission and, 246, 248f, 249, 259–260, 654f Interference microscopes, 735–736, 736f Interferon-α (IFN-α), in immune response, 701f, 703 Interferon-β (IFN-β), in immune response, 703 Interferons in autoimmunity, 726HP in immune response, 701f, 703, 708, 709t Interkinesis, 608 Interleukins (ILs), 708, 709t in autoimmunity, 726HP in signal transduction pathways of lymphocyte activation, 724 Intermediate filaments (IFs), 324, 325t, 354–356 AFM measurement of, 329, 329f assembly and disassembly, 354–356, 355f axonal, 333f, 356 compared with microfilaments and microtubules, 354 functions, 325f, 356 of hemidesmosomes, 249–250, 250f of nuclear lamina, 489f, 489–490 organization of, 357f related disorders, 356 types, 356 types of, 354t Intermembrane space, 180, 182, 659f, 660 Internal energy (E), 87–88, 88f Internalization signal, in
endocytosis, 312, 312fn, 320EP, 624–625 Internal ribosome entry site (IRES), 470fn, 536 Interphase, 573 Intervening sequences. See Introns Intestinal epithelial cells, 3–5, 4f glucose absorption, 160–161, 161f Intestinal microvilli, 3, 4f

Intracellular messengers Ca2+ as, 648 Ca2+-binding proteins, 651–652, 651t, 652f IP3 and voltage-gated calcium ion channels, 649 plant regulation of Ca2+ concentration, 652–653, 652f visualizing cytoplasmic Ca2+ concentration in real time, 649–651, 649f, 650f, 651f NO as, 655–656, 655f Intraflagellar transport (IFT), 350, 350f Introns, 407–408, 445–446, 445f, 454 acting as exons, 454 evolutionary impact, 454 in globin genes, 447f group I, 450 group II, 450, 451f pre-mRNA splicing compared with, 451, 453, 453f lariat formation, 450, 451f mRNAs and, 446f trinucleotide repeats and, 404HPf Intron-exon junctions, 449–450, 452f Inverted repeats, in DNA, 409, 410f Ions, 34 behavior in water, 36f conductance, 151, 152f diffusion across membranes, 151–156 Ion channels, 151–156. See also specific channels in acetylcholine receptor, 173EP–174EP, 174EPf conformational changes in, 136–137, 137f, 174EP, 174EPf defects in, associated diseases, 162HP–163HP, 162HPt inactivation, 155–156, 156f in stomata opening, 652 in synaptic transmission, 169, 169f Ion-exchange chromatography, 753, 753f Ion gradients cotransport energetics, 161 coupling to active transport, 160–161, 161f as form of energy, 189 across membranes, 148–149, 157, 157t in mitochondria, 189, 198, 204–205 voltage across membranes and, 165 Ionic bonds, 35–36, 36f in myoglobin, 59, 59f Ionization, 34 Ion–product constant of water, 40 Ion pumps P-type, 159, 159f V-type, 159–160 Ion transport systems, 157–161, 158f, 159f, 161f iPS cells. See Induced pluripotent stem cells IRES. See Internal ribosome entry site Iressa, 691 Iron regulatory protein (IRP), 537 Iron-response element (IRE), 537, 537f Iron-sulfur centers, 192, 192f, 223, 223f, 537f Iron-sulfur proteins, 192, 192f of electron transport chain, 192 of photosystem I, 223, 223f Irritability, 164 IRSs.
See Insulin-receptor substrates Isocitrate dehydrogenase, in cancer development, 681 Isoelectric focusing, 757 Isoelectric point, protein separation based on, 753 Isoforms, of proteins, 76, 407 Isolation, of proteins, 752 liquid column chromatography, 753–756, 753f, 754f, 755f PAGE, 756–757, 756f, 757f

protein measurement and analysis techniques, 757–758, 758f, 758fn selective precipitation, 752–753 Isoleucine (Ile, I), 52f, 53 Isomers, 44–45 Isoprenoid unit, 191f Isopycnic centrifugation. See Equilibrium centrifugation Isotonic (isosmotic) solution, 150, 150f Isotopes, radioactive, 748–749, 749t, 750f J JAK–STAT pathway, 724 Janus kinases, in signal transduction pathways of lymphocyte activation, 724 J genes, 713–716, 714f, 715f Joule, 33fn “Jumping” genes, 408–410, 409f, 410f Junctional complex, 257, 258f Junctions between cells, 257, 258f types, 257 K Kaposi’s sarcoma, 626HP, 668 Kappa (κ) chains of immunoglobulins, 710t, 711–712 genes encoding, 713–716, 714f, 715f Kartagener syndrome, 349HP Karyotypes, 502, 503f of Down syndrome, 609HP of cancer cells, 667f KDEL receptor, 298, 298f KDEL retrieval signal, 298, 298f Keratins, 354t, 356 conformation of, 55, 56f Keratin filaments, 249–250, 354–356, 356f Ketoses, 43, 43f Kidney glomerular basement membrane, 237–238, 238f Alport syndrome and, 240 tubules, and tight junctions, 262 Kidney cancer genes associated with, 673t, 675f incidence and mortality of, 665f Kilocalorie, 33fn Kinases. See Protein kinases Kinesins, 335f, 336f, 337f, 338–339 conformational changes, 334, 335f depolymerizing, 595–596, 595f direction of movement, 334–335, 337f in intraflagellar transport, 350, 350f in kinetochore, 585–586, 586f in mitosis, 588–589, 597, 598f in motility assays, 328–329, 328f movement along microtubules, 334–337, 335f Kinesin-related proteins (KRPs), 334–336 Kinetics of enzymes, 102–105, 105f thermodynamics compared with, 95, 192–193 Kinetochores, 509, 585, 586f in anaphase, 594–596 in prometaphase, 588–589 in spindle assembly checkpoint, 596, 596f Kingdoms, 28EPt, 29EP Klinefelter syndrome, 609HP Knockout mice, 336f, 579f, 778–780, 779f, 779fn KRAS gene, in colon cancer, 683–684, 683f, 684f Kupffer cell, 304, 304f “Kuru,” 66HP

L L1 cell-adhesion molecule, 252f, 253 L1-deficiency disorders, 253 L1 repeated DNA sequences, 403, 410 lac operon, 485–487, 486f, 487f attenuation, 486 catabolite repression, 486 β-Lactamase, 107HP–108HP Lactose, 45 bacterial culture in, 484, 484f Lactose tolerance, 419HP–420HP Lagging strand during replication, 552–554, 552f, 555f, 561–562, 562f Lambda (λ) chains of immunoglobulins, 710t, 711–712 Lambda phage, eukaryotic DNA cloning in, 768–769, 769f Lamellipodia, 375f, 376–379, 376f, 377f, 378f, 380f, 381 Lamins, 490 Laminins, 239f, 243–244, 260, 361 Lamin proteins, 354t, 489–490 Laser scanning confocal microscopy, 739–740, 739f Late endosomes, 312, 313f Lateral gene transfer (LGT), 29EP LDL. See Low-density lipoprotein Leading edge of motile cell, 376f, 377, 377f, 378f Leading strand during replication, 552, 552f, 555f, 561–562, 562f Leaflets of membranes, 123, 124f asymmetry, 139, 285, 285f size, 127–128 Leaves of C4 plants, 231f, 232 stomata of, 232 functional organization, 213f Lectins, 259f Lens of light microscopes, 733–734, 733f of TEMs, 741–742, 742f Leptotene, 604, 605f let-7 miRNA, 459 in cancer development, 669 Leucine (Leu, L), 52f, 53 Leucine zipper motif, 522 Leukemias development of, 669 gene expression profiles of, 685–686, 686f genes associated with, 680–681 immunotherapy for, 688–689 incidence and mortality of, 665f inhibition of cancer-promoting proteins in, 690 miRNA role in, 681 Leukocytes, in inflammation, 255HP–256HP, 255HPf Leukocyte adhesion deficiency (LAD), 255HP–256HP Lever-arm hypothesis of myosin action, 369, 369f Libraries DNA, 773 cDNA libraries, 773–775, 775f genomic libraries, 773–774, 775f of organic compounds, for drug design, 75, 75f of peptides, 72 of siRNAs, 457, 780, 781f Li-Fraumeni syndrome, tumor-suppressor genes in, 673t, 675 Ligands binding to integrins, 245–246, 245t, 246f binding to selectins, 252f in cell signaling, 618 in endocytosis, 308, 310f, 312, 313f as external stimuli, 122

message-carrying, 312, 621 types, 308 Ligand-gated channels, 152 in cell signaling, 621 in synaptic transmission, 168-171, 171HP-174HP Light absorption, in photosynthesis, 216–219 by antenna pigment, 220f, 222, 223f by photosynthetic unit, 218, 218f Light chains, of immunoglobulins, 710–712, 710t, 711f, 713f genes encoding, 713–716, 714f, 715f Light energy, use in active transport, 160, 160f Light-harvesting antenna, 218 Light-harvesting complexes LHCI, 223, 223f LHCII, 220, 220f Light-independent (dark) photosynthetic reactions, 216, 227f, 228f, 229 Light microscopes, 733, 733f bright-field microscopes, 735 specimen preparation for, 735 fluorescence microscopy, 736–740, 737f, 738f, 739f, 740f laser scanning confocal microscopy, 739–740, 739f phase-contrast microscopes, 735–736, 736f resolution of, 733–734, 734f super-resolution fluorescence microscopy, 740, 740f video microscopy and image processing, 738–739 visibility with, 734–735, 735f Light reactions, in photosynthesis, 218-225 Lignin, of plant cell wall, 268 Limit of resolution, of light microscopes, 734, 740 Linen, 47 LINEs (long interspersed elements), 403 L1 sequences, 410 Lineweaver-Burk plot, 103, 103f Linezolid, 106HP–107HP Linkages. 
See also Bonds of genes, 389 incomplete, 390–391 in RNA and DNA strands, 77–78, 77f Linkage groups, 389 Linker DNA, between nucleosome core particles, 495 Linker histone (H1), 495 Linseed oil, 48–49, 48f Lipids, 47–49 fats, 47–49, 48f phospholipids, 49, 49f precursors of, 42 steroids, 49, 49f synthesis in the ER of, 285, 286f Lipid-anchored membrane proteins, 130, 131f, 137–138 Lipid bilayers, 120f effects of temperature, 138–139, 138f fluidity, 120f factors affecting, 138–139 freeze-fracturing, 132, 132f fusion, 302 of membranes, 123–124, 123f, 124f, 146f asymmetry of, 128–129, 129f incorporation of proteins, 283, 284f, 285f movement of substances through, 148f, 149 nature and importance, 127–128, 127t size of, 19f Lipid envelope of virus, 24f Lipid hydrolyzing enzymes, 304t, 628f, 629-630 Lipid rafts, 124f, 139–140, 140f, 723 Lipofection, 776 Lipofuscin granule, 305, 305f

Lipopolysaccharide (LPS), immune response to, 701 Lipoproteins, 313–315, 314f Alzheimer’s disease and, 417HP–418HP Liposomes, 128, 129f in study of vesicle formation, 276, 277f in study of vesicle fusion, 302 Liquid column chromatography, 753 affinity chromatography, 754, 754f gel filtration chromatography, 754, 754f ion-exchange chromatography, 753, 753f techniques for determining protein–protein interactions, 754–756, 754f, 755f Liquid scintillation spectrometry, 749 Live-cell imaging, 326–330 Liver cancer, 668 genes associated with, 675f lncRNA. See long noncoding RNA Locomotion. See also Cell locomotion ciliary and flagellar mechanism, 351–352, 352f role of dynein, 350–351, 351f sliding-microtubule theory, 353, 353f of kinesins, 335–336, 335f membrane motility, 128f Loci of genes on chromosome, 391 Longevity cell signaling role in, 647HP–648HP, 647HPf cholesterol levels and, 314-315, 319EP Long noncoding RNA (lncRNA), 461 as transcriptional repressors, 530f, 532–533, 533f in X-chromosome inactivation, 499 Long QT syndrome, iPS cells for, 22HP Long-term potentiation (LTP) of synaptic function, 171 Lorenzo’s Oil, 208HP–209HP Loss-of-function mutations, 405HP in GPCRs, 626HP in tumor-suppressor genes, 671f Low-density lipoprotein (LDL) atherosclerosis and, 314, 314f cholesterol metabolism and, 313–315, 319EP–320EP, 320EPf in endocytic pathway, 313, 319EP–321EP, 321EPf high-density lipoproteins compared with, 314 Low-density lipoprotein (LDL) receptors, 313–314, 313f, 320EP, 320EPf, 417HP Low-energy electrons, 183fn, 187, 215f Lung cancer development of, 669 DNA repair defects, 569HP genes associated with, 673, 675f incidence and mortality of, 665f inhibition of cancer-promoting proteins in, 691 smoking role in, 668 Lymph nodes, 700f T cell activation in, 708, 708f Lymphocytes, 701f, 703 B cells in autoimmune disease, 724HP, 726HP clonal selection theory applied to, 704–707, 704f, 705f, 706f DNA rearrangements producing genes encoding antigen receptors of, 713–716, 
714f, 715f in immune response, 701f, 703, 703f memory, 705–706 selection of, 721fn TH cell activation of, 722f, 723 TH cell interaction with, 709–710, 709f, 709fn cell-surface signals activating, 722–723, 722f signal transduction pathways in activation of, 723–724 size of, 19f

T cells activation and mechanism of action of, 707–710, 707f, 708f, 709f, 709fn APC activation of, 722–723, 722f APC interactions with, 717, 718f, 720–721 in autoimmune disease, 724HP, 726HP B cell activation by, 722f, 723 cancer immunotherapy using, 689 cytotoxic, 709, 718f, 720–721 division of, 699f DNA rearrangements producing genes encoding antigen receptors of, 713–716, 714f, 715f helper, 709–710, 709f, 709fn, 718f, 720–723, 722f in immune response, 701f, 703, 703f memory, 708 MHC interaction with, 727EP–728EP, 727EPt, 728EPt regulatory, 710 selection of, 721, 721f Lymphoid organs, 699, 700f Lymphomas, 668 genes associated with, 673t, 680–681 immunotherapy for, 688 incidence and mortality of, 665f miRNA role in, 682 Lysine (Lys, K), 51–53, 52f, 53f Lysosomal enzymes (lysosomal proteins) addition of recognition signals, 299 in endocytic pathway, 313f from endoplasmic reticulum to lysosomes, 299f in lysosomal disorders, 306HP phosphorylation in Golgi complex, 299, 299f sorting and transport, 299–300, 299f synthesis on ribosomes, 282–283, 282f targeting to lysosomes, 299f transport from TGN, 299, 299f, 300, 313f Lysosomal storage disorders, 306HP–307HP, 306HPf Lysosomes, 8f–9f, 11f, 272f, 303–305, 304f in antigen processing, 719f, 720, 727EP autophagy in, 304–305, 305f defects in function, 306HP–307HP dynein transport of, 338 enzymes of, 303–304, 304t origin of, 27EP transport of, 337f uptake of lysosomal enzymes, 299f in vesicular transport, 296f Lytic viral infection, 25 M Macromolecules, 41 precursors for, 41–42 Macrophages, 304, 314f, 315, 374, 375f, 700, 701f in lysosomal storage disorders, 307HP Macular degeneration, 418HP, 459HP–460HP Mad2 protein, 596, 596f “Mad cow disease,” 66HP Magnesium, 216, 262, 551f Magnification, resolution and, 733–734, 734f Major histocompatibility complex (MHC), 716–721, 717f, 718f, 719f, 720f in antigen presentation, 727EP–730EP, 727EPt, 728EPt, 729EPf, 730EPf in autoimmunity, 725HP class II molecules, 717, 718f,
719f, 720–721, 720f class I molecules, 717–720, 718f, 719f, 720f structure of, 728EP, 729EPf T cell recognition of, 721, 721f Malate-aspartate shuttle, 186 Malate dehydrogenase, 76, 77f Malignant tumors. See Cancer

Mammalian embryos, DNA transfer into, 775–778, 776f, 777f embryonic stem cells from, 21 Mammary gland cells, differentiation, 260 Manganese (Mn) ions, in photosynthesis, 220f, 221–222 Mannose 6-phosphate, in lysosomal disorders, 306HP Mannose 6-phosphate receptors (MPRs), 299–300, 299f MAP kinase adaption of, 643–644, 643f RTK activation of, 640–644, 641f, 642f, 643f Mapping of genes on chromosomes, 391–392 of haplotypes, 419HP–420HP MAPs. See Microtubule-associated proteins Margarine, 49 Marker chromosome, 509 Mass action law, 90 Massively parallel sequencing, 420, 514, 524, 773 Mass spectrometry, 757–758, 758f, 758fn in protein identification, 71–73, 72f Matrix, of mitochondria, 180, 181f, 182 Matrix-assisted laser desorption ionization (MALDI), 758 Matrix metalloproteinases (MMPs), 244 cancer and, 256HP–257HP extracellular protein degradation by, 244 Maturation model of Golgi movement, 293–294, 294f Maturation of oocyte, 611EP, 611EPf Maturation-promoting factor (MPF), 575, 575f, 607, 611EP–614EP cell cycle and, 612EP, 612EPf oocyte maturation and, 611EP, 611EPf relation to cdc2 kinase, 614EP relation to cyclins, 613EP–614EP Maximal velocity of enzyme reaction (Vmax), 103 MCM proteins, in DNA replication, 560–561, 560f MDM2, in cancer development, 676–677, 679f, 682f MD simulations. See Molecular dynamics simulations Mechano-gated channels, 152 Media, for cell cultures, 750 Medical diagnosis screening tests for cancer, 72–73, 693 protein microarrays in, 72–73 serum proteomics, 72 Medical therapies antibiotics to treat infections, 106HP–108HP,
107HPt for Alzheimer’s disease, 68HP–69HP for autoimmune diseases, 725HP–726HP bacteriophages for infections, 26 bone marrow transplants, 20HP, 209HP, 307HP, 726HP for cancer, 687–693 for cystic fibrosis, 162HP–163HP cell replacement therapy, 20HP–23HP DNA delivery systems, 163HP for lysosomal storage disorders, 307HP nanomachines in cancer, 328 RNA interference, 458HP–459HP for stroke, 246, 247f for X-ALD, 209HP Medications anti-inflammatory selectin inhibitors, 255HP based on protein structure, 75–76, 75f for cancer matrix metalloproteinase (MMP) inhibitors, 256HP–257HP

mitotic spindle inhibitors, 341 preventive drugs, 669 RNA interference therapy, 458HP–459HP topoisomerase II inhibitors, 398 competitive enzyme inhibitors, 104 customizing via genetic profile, 419HP for cystic fibrosis, nonsense-codon therapy, 475 cytochrome P450 enzymes and, 280 cytochrome P450 genes and, 419HP delivery by liposomes, 128, 129f drug development, 75f clinical trials, 75f preclinical trials, 75f effects at synapses, 170 for inducing weight-loss cannabinoid-blockers, 170 oxidation-phosphorylation uncouplers, 198–199 for infections. See Antibiotics for lysosomal disorders cord blood, 307HP enzyme replacement therapy, 307HP for muscular dystrophy, nonsense-codon therapy, 475 for opening tight junctions in blood-brain barrier, 262 for xeroderma pigmentosum, DNA repair enzymes, 568HP for preventing heart attacks or stroke anti-integrin agents, 246, 247f cholesterol-lowering statins, 314 cholesteryl ester transfer protein (CETP) inhibitors, 314–315 Meiosis, 602–611, 602f, 603f anaphase of, 607, 607f chromosome aberrations during, 504HP–505HP, 504HPf chromosome number and, 389, 389f crossing over during, 390, 391f, 606, 606f unequal, 407, 407f delayed completion in females, meiotic nondisjunction and, 609HP in eukaryotes, 603, 603f genetic recombination during, 610–611, 610f metaphase of, 607, 607f mistakes during, 608HP–609HP, 608HPf mitosis compared with, 602–603 prophase of, 603–604, 604f, 605f diakinesis, 607 diplotene, 606–607, 606f leptotene, 604, 604f, 605f pachytene, 606, 606f zygotene, 604, 604f, 605f sporic, 603, 603f stages of, 602–605 premeiotic S phase, 603–604 telophase of, 607f, 608 terminal, 603, 603f zygotic, 603, 603f Meiotic nondisjunction, 608HP–609HP, 608HPf Meiotic spindles, 609HP Melanoma, 569HP genes associated with, 673t, 675f, 680 immunotherapy for, 688 incidence and mortality of, 665f inhibition of cancer-promoting proteins in, 691, 691f metastasis of, 665f Melanosomes, 364, 364f Membranes asymmetry of, 128–130 bacterial compared 
with mitochondrial, 182, 182f carbohydrates in, 124f, 129–130

chemical composition, 124f, 125–130 in cyanobacteria, 14f chloroplasts and, 14 cytoplasmic, 10, 270–323 depolarization, 155 dynamic nature, 125, 140–147 eukaryotic compared with prokaryotic, 10 evolution of, 26EP–27EP flow through cell, 285, 285f fluidity, 138–140 maintenance, 139 functions, 121–122, 122f, 147 ion gradients across, 157, 157t leaflets of, 123 lipid bilayers of, 123–124, 123f, 124f asymmetry of, 128–129, 129f nature and importance, 127–128, 127t oligosaccharides in, 45 permeability, 149, 149f semipermeable, 149 “sidedness” (asymmetry), 134, 134f, 139, 140f, 285, 285f synthesis, 285 modifications, 286f three-layered structure, 120–121, 121f of vesicle and target, fusion, 302, 303f water role in, 38 Membrane domains, cell polarity and, 144 Membrane lipids, 124f, 125–129, 126f, 127t asymmetry, 139, 140f chemical structure, 126f membrane fluidity and, 138–140 mobility, 140, 141f, 143–144 restrictions on, 144, 144f modifications, 285, 286f ratio to proteins, 125 synthesis, 285 transfer between membranes, 285, 286f Membrane potentials, 165, 166f at equilibrium, 166 gated ion channels and, 155 nerve impulses and, 164–171 Membrane proteins, 124f, 130–138. See also Integral membrane proteins classes, 130, 131f crystallization of, 133 detergent solubilization of, 132–133, 133f GPI-anchored, 131f homology modeling of, 133 as ion channels, 152 lipid-anchored, 130, 131f, 137–138 mobility of, 141–143, 141f of organelles, 130, 316–318 orientation of, 134–135, 134f, 135f, 284, 284f, 285f peripheral, 130, 131f, 137, 143 ratio to lipids, 125 Membrane receptors, 50. See also Cell surface receptors for neurotransmitters, 169, 169f in signal transduction, 122 Membrane trafficking, 270–323 from endoplasmic reticulum to Golgi complex, 290f, 296f, 298f through Golgi complex, 293 from Golgi complex to endoplasmic reticulum, 296f, 298 study via cell-free systems, 276 Membrane transporters, 50 Memory, immunologic, 705–706 Memory B cells, 705–706 Memory T cells, 708

Mendel’s laws of genetics, 388 in meiosis, 607 physical basis, 389 Mesothelioma, asbestos role in, 668 Messenger RNA (mRNA), 428, 428f 3’ poly(A) tail, 444, 444f, 449–450, 449f 5’ methylguanosine cap, 444, 444f, 449, 449f, 470 complementarity, 428 cytoplasmic localization of, 537–538, 538f destruction by RNA interference, 455f, 456f, 457 medical uses, 458HP–459HP discovery, 428 in elongation of nascent polypeptide, 471–474, 472f as information carriers, 428, 428f in initiation of protein synthesis, 468–471, 469f, 470fn interaction with tRNAs, 466, 467f introns and, 446f mutations and, 474–475 processing, 448–454, 450f coordination with transcription, 453f intermediates formed, 452f, 453f stability of, control of, 538–539, 539f structure, 444, 444f surveillance of, 474–475 synthesis and processing, 441–455 transcription, 441–444, 476, 476f preinitiation complex assembly, 442f, 443f in translation, 468–477, 469f, 471f, 472f simultaneous translation, 476 untranslated (noncoding) regions (UTRs), 444, 444f trinucleotide repeats and, 404HPf Messengers. See Extracellular messengers, second messengers Metabolic intermediates, 42, 108 Metabolic pathways, 42, 108, 109f Metabolic reactions. 
See also Biochemical reactions coupled, 92–93 free-energy changes in, 91–92 interrelatedness, 92, 230f, 231 Metabolism, 108–117, 109f anaerobic and aerobic, 188HP of carbohydrates, overview, 183f energy capture and utilization in, 110–115 feedback inhibition, 115, 116f “inborn errors,” 427 oxidation and reduction, 109–110 oxidative, 183–187 free radicals and, 35HP photosynthetic, 214–215 in prokaryotes, 13 as property of cell, 6 regulation, 115–117 steady-state, 93–94 equilibrium compared with, 93–94, 94f Metabolites, 42, 108 Metagenome, 15 Metaphase of meiosis, 607, 607f microtubule flux in, 590–592 of mitosis, 590–592, 591f Metaphase plate, 590, 591f Metastasis, 256HP–257HP, 256HPf, 664, 665f role of cell-adhesion molecules, 256HP–257HP Metastatic cells basement membrane and, 256HPf properties, 256HP

Metformin, cancer prevention with, 669 Methanogens, 14 Methicillin-resistant S. aureus (MRSA), 106HP Methionine (Met, M), 52f, 53 initiation codon and, 468–469 Methionyl-tRNAs, 469, 469f 3-Methyladenine, 567 N-Methyl-D-aspartate (NMDA) receptor, synaptic strengthening and, 170–171 Methyl group, 41t of rRNA, 437–438 Methylguanosine cap on mRNA, 444, 444f, 448, 469–470 on pre-mRNA, 449f MHC. See Major histocompatibility complex MHC class II molecules, 717, 718f, 719f, 720–721, 720f MHC class I molecules, 717–720, 718f, 719f, 720f Micelles, 48, 48f Michaelis constant (KM), 103–104, 103f Michaelis-Menten equation, 103 Michaelis-Menten relationship, 102, 103f Microarrays for gene-expression analysis, 514–517, 524f, 685–687, 686f, 687f of proteins (protein chips), 72 Microbiome, human, 15 Microfibrils, cellulose, 266 Microfilaments, 324, 356–364. See also Actin filaments of animal cell, 8f–9f assembly and disassembly, 358–360 equilibrium between, 359–360 effects of disruption, 360 functions, 325f intermediate filaments compared with, 354 connected to, 356, 357f motility and, 356 size of, 17 in vesicular transport, 363–364, 364f Micrometer (μm), 17 MicroRNA (miRNA), 456f, 459–460, 459f in cancer development, 681–683 role in development, 459–460 in translation control, 539–540, 540f Microsatellite DNAs, 402 Microscopes, early models, 2, 2f Microscopy AFM, 748, 748f light microscopes, 733, 733f bright-field microscopes, 735 fluorescence microscopy, 736–740, 737f, 738f, 739f, 740f laser scanning confocal microscopy, 739–740, 739f phase-contrast microscopes, 735–736, 736f resolution of, 733–734, 734f super-resolution fluorescence microscopy, 740, 740f video microscopy and image processing, 738–739 visibility with, 734–735, 735f SEM, 740–741, 746–748, 747f specimen preparation for, 747 TEM, 740–742, 741f, 742f specimen preparation for, 742–746, 743f, 744f, 745f, 746f Microsomes, 275 isolation for study, 275–276, 276f, 752

Microtubular motors (microtubule-associated motors) dyneins as, 337–339, 337f kinesins as, 334–339, 337f plus end-directed, 335 minus end-directed compared with, 335 Microtubules, 178f, 179f, 324, 324f, 330–353 assembly, 338f, 339–341, 339f in vitro, 342–343, 343f astral in metaphase, 590 in prophase, 586–587, 587f of axonemes, 346, 347f, 348f sliding, 351–352, 352f of basal body, 348f A and B tubules, 346, 347f changes during cell cycle, 341–342, 341f chromosomal, 590 cortical, 331f, 332 depolymerization of, in anaphase, 594–595, 595f disassembly and reassembly, 342 in vitro studies, 342–344 dynamic behavior, 327f, 341–345, 341f, 343f, 344f studies of, 342–345 tubulin dimers and, 342–344, 343f elongation, 339 in eukaryotic cells, 8f–9f, 12, 12f FRAP study of, 329–330, 330f functions of, 325f, 332–334 intermediate filaments compared with, 354 connected to, 356, 357f intracellular motility and, 333–334 role of cytoplasmic dynein, 337–339, 337f role of kinesins, 334–339, 335f, 337f minus ends, 330–331, 340 in nucleation, 340f, 341 in motility assays, 328, 328f in neural tube formation, 381, 382f nucleation, 338–341, 339f, 340f, 342f in plant cells, 341–342, 342f plus ends, 330–331, 340, 345f dynamic behavior and, 342–344, 343f, 344f polarity of, 330, 334, 340, 590 establishment of, 340f, 341 in prometaphase, 590 properties, 325t protofilaments, 330–331, 331f role of centrosomes, 339–340, 339f size of, 17 as structural support, 331f, 332–333, 332f structure and composition, 330–331, 331f syntelic attachments of, 596–597, 596f +TIPs, 345, 345f in transport, 326f, 333f, 334, 337f, 363–364, 364f direction of, 334–335 kinesins in, 334–337, 335f, 336f Microtubule-associated proteins (MAPs), 331–332, 331f Microtubule-organizing centers (MTOCs), 339–341, 339f Microvilli, intestinal, 3, 4f actin filaments in, 374f Midbody, 598, 598f Minisatellite DNAs, 401–402 Minus (slow-growing) end of actin filament, 358, 358f, 359f of microtubule, 330–331, 340 in nucleation,
340f, 341 miRNA. See MicroRNA Misfolded proteins, 66HP–70HP, 288 Mismatched base pairs, colon cancer and, 569HP

Mismatch repair (MMR), 567 defects in, 569HP mutations and, 569HP Mitochondria, 4f, 5, 8f–9f, 10, 11f, 178f, 179–198, 179f, 181f abnormalities, 207HP–209HP, 207HPf, 208HPf aerobic respiration, 205f in apoptosis, 659f, 660 in ATP formation, 189–197 autophagy of, 305f calcium ion storage, 180, 648fn chloroplasts and, 213–215 in differentiated cells, 16–17 disappearance during evolution, 27EPfn electron transport, 186, 187f, 193–197, 194f endosymbiont origins, 26EP–27EP, 27EPf, 29EP energy storage and use, 189 fusion and fission of, 180f in heart muscle, 188HP kinesin-mediated transport, 336f, 337 matrix, 180, 181f, 182, 316, 317f membranes, 180–182 bacterial membranes and, 182f inner, 180, 181f, 182, 316–318, 317f intermembrane space, 182 matrix, 180, 182 outer, 180, 181f, 182, 316, 317f permeability, 182 membrane lipids, transfer from endoplasmic reticulum, 285 in muscle fibers, 188HP in photorespiration, 230f, 231 proteins sites of synthesis, 316 uptake of, 316–318, 317f proton-motive force and, 198–199, 205 proton transport, 187, 187f, 194f respiratory rate, 205 size of, 17, 19f structure and function, 179–182 in TCA cycle, 186–187 Mitochondrial chaperones, 81EP Mitochondrial DNA (mtDNA), 181f, 182 mutations, 207HP–208HP, 208HPf Mitosis, 12, 12f, 573, 573f, 582f anaphase, 592–597 chromosome movements at, 594–596, 595f events of, 592–594, 594f proteolysis in, 592, 593f spindle assembly checkpoint, 596–597, 596f chromosomes in, 493–494 structure of, 501–502, 503f cytokinesis with, 597–602, 598f, 599f, 600f meiosis compared with, 602–603 metaphase, 590–592, 591f mitotic chromosome formation, 583–586 motor proteins in, 597, 598f nuclear lamina in, 490 origin of, 26EP–27EP prometaphase, 588–590, 589f, 590f, 591f prophase, 582f, 583–588, 583f, 584f mitotic spindle formation, 586–588, 587f, 588f nuclear envelope and organelles in, 588 stages of, 582f telophase, 597, 597f Mitotic chromosomes, 497, 497f formation of, 583–586 Mitotic spindle, 12, 12f, 572f assembly 
checkpoint for, 596–597, 596f cancer chemotherapy and, 341 cytoskeleton in, 345

formation of, 586–588, 587f centrosome duplication, 586–587, 587f without centrosomes, 588, 588f microtubules in, 341f, 342, 345 M line of sarcomere, 366, 366f Mobile genetic elements, 408–411, 480-481 role in evolution, 410–411 Mobility within membrane, 124-125 of membrane lipids, 140, 141f, 143–144, 144f restrictions on, 144 of membrane proteins, 141–143, 141f, 143f Model organisms, 17, 18f. See also Animal models Mole, atoms in a, 33fn Molecular biology techniques, 732–783 AFM, 748, 748f antibodies used in, 780–783, 782f cell cultures, 749–751, 751f chemical synthesis of DNA and RNA, 764 differential centrifugation, 752, 752f DNA libraries, 773 cDNA libraries, 773–775, 775f genomic libraries, 773–774, 775f DNA sequencing, 771–773, 772f DNA transfer into eukaryotic cells and mammalian embryos, 775–778, 776f, 777f genes elimination/silencing knockout mice, 778–780, 779f, 779fn RNA interference, 780, 781f in vitro mutagenesis, 778 isolation, purification, and fractionation of proteins, 752 liquid column chromatography, 753–756, 753f, 754f, 755f PAGE, 756–757, 756f, 757f protein measurement and analysis techniques, 757–758, 758f, 758fn selective precipitation, 752–753 light microscopy, 733, 733f bright-field microscopes, 735 fluorescence microscopy, 736–740, 737f, 738f, 739f, 740f laser scanning confocal microscopy, 739–740, 739f phase-contrast microscopes, 735–736, 736f resolution of, 733–734, 734f super-resolution fluorescence microscopy, 740, 740f video microscopy and image processing, 738–739 visibility with, 734–735, 735f nucleic acid fractionation, 760 by gel electrophoresis, 760, 761f by ultracentrifugation, 760–762, 762f nucleic acid hybridization, 762–764, 763f PCR, 769–771, 770f radioisotopes, 748–749, 749t, 750f recombinant DNA technology, 764 DNA cloning in, 766–769, 767f, 768f, 769f recombinant DNA formation in, 766, 766f restriction endonucleases in, 764–766, 765f SEM, 740–741, 746–748, 747f specimen preparation for, 747 structure determination of 
proteins and multisubunit complexes, 758–760, 758f, 759f, 760f TEM, 740–742, 741f, 742f specimen preparation for, 742–746, 743f, 744f, 745f, 746f Molecular chaperones of Hsp70 family, 65 in protein folding, 65, 65f Molecular dynamics (MD) simulations, 60, 60f, 65f, 151f

Molecular motors, 50, 203–204, 334 for actin filaments, 360–364 for microtubules, 334–339, 350–353, 597 observing by video microscopy, 328, 328f, 362f Moles, 670–671, 677 Monoclonal antibodies cancer therapy using, 688, 693 molecular biology techniques using, 782–783, 782f Monogalactosyl diacylglycerol, 214 Monomers, in macromolecules, 41, 42f Monosaccharides, 43 Monosomies, 608HP–609HP Monounsaturated fatty acids, 126 Morphogenesis, 254, 254f, 381, 382f Motifs, of transcription factors, 519–522 HLH, 521, 521f leucine zipper, 522 zinc-finger, 520–521, 520f Motile cells, 374–379, 376f, 377f, 378f. See also Cell locomotion cytoskeleton in, 326, 326f steps in movement, 375–376, 376f, 378f Motility of cells, 356, 378f of cell components, 356 cytoskeleton and, 324–383 driven by actin polymerization, 374, 375f, 377 intracellular, role of microtubules, 333–334 microfilament-related, 356, 358 of nonmuscle cells, 371–381, 382f nonprocessive, 369 processive, 334, 362–363, 363f role of actin filaments, 358 sliding-microtubule theory, 353f Motility assays, in cytoskeleton studies, 328, 328f, 335, 362–363, 362f, 363f Motor neurons, 333–334, 370, 371f Motor proteins, 6, 325f. See also Molecular motors in axonal transport, 333f, 334 on chromosomes, 597, 597f conformational changes, 334 in intraflagellar transport, 348f in mitosis, 597, 598f in motility assays, 328, 328f movement along microtubules, 333f, 334, 597 plus compared with minus end-directed, 335 polarity and, 334 Motor units, 370 Movement proteins, in plants, 265, 265f M phase. See Mitosis mRNA. See Messenger RNA MRSA. See Methicillin-resistant S.
aureus Multigene families, 405, 407, 408f Multiple sclerosis (MS), autoimmunity and, 724HP–726HP Multipotent cells, 20HP, 669, 670f, 703, 703f Multisubunit complexes, structure determination of, 758–760, 758f, 759f, 760f Multivesicular bodies (MVBs), 312–313 Muscarinic acetylcholine receptors, 172EPfn Muscles, striated, 366 Muscle cells fermentation in, 114, 114f role of desmin, 356 Muscle contraction, 364–371 energetics of, 369–370 energy sources, 188HP excitation-contraction coupling, 370–371 molecular basis, 367f, 369, 369f role of tropomyosin, 370–371, 371f

sliding filament model, 366–371, 368f smooth endoplasmic reticulum and, 280 Muscle fibers, 366, 366f, 371f Muscle tissue desmin-related disorders, 356 mitochondrial abnormalities, 208HP Muscular dystrophy, 147, 475, 490 Mus musculus, 18f Mutagenesis in Drosophila research, 392 site-directed, 74–75, 135-137, 778 Mutagenic compounds, 667–668 Mutants, 277–278, 390 temperature-sensitive, 275, 275f, 549, 576 use in research, 275, 275f, 277–278, 392 Mutations, 390 in brain tumors, 684 in breast cancer, 685 of cancer cells, 669–670, 670f oncogene mutations, 671–672, 671f, 672f, 679–681, 679f, 682f tumor-suppressor gene mutations, 671–679, 671f, 672f, 673t, 674f, 675f, 676f, 677f, 679f in colon cancer, 683–684, 683f, 684f from DNA damage, 568HP–569HP of DNA repair genes, 681 DNA replication and, 557–558, 557f, 558f in Drosophila melanogaster, 390f evolutionary relationships and, 27EP–28EP frameshift, 473–474 gain-of-function, 404HP gene divergence and, 408, 408f gene duplications and, 407–408, 408f genetic code and, 463–464, 464f of GPCRs, 625HP–626HP, 625HPf, 626HPt human diseases from, 208HP–209HP, 208HPf “jumping” genetic elements and, 409 in lamins, 490 loss-of-function, 405HP in mismatch repair system, 569HP mitochondrial, 207HP–208HP nonsense, 474 in pancreatic cancer, 684, 685t in protein primary structure, 55, 55f pseudogenes and, 408 of RB gene, 673, 673t, 674f termination codons and, 474–475 of TP53 gene, 675–677, 675f, 676f, 677f, 679f trinucleotide repeats and, 404HP use in research, 74, 277–278, 277f viral, 25 Mutator phenotype, in cancer, 681 MYC oncogene, 680, 682f Mycoplasma, 14 Myelin sheaths, 125f, 164, 164f, 167, 168f from ES cells, 21HP Myoblasts, 366 MyoD, as master regulatory factor, 518 Myofibrils, 366, 370–371, 371f Myoglobin conformation of, 55, 56f evolution of, 408 function of, 58 size of, 17, 19f tertiary structure of, 58–59, 58f time-resolved X-ray crystallography, 103f Myosins, 360–364 in actin decoration, 375f in cell locomotion, 378, 380f 
conformational changes in, 60

in cytokinesis, 599–600, 600f head (motor) domain, 360, 361f, 369 in sarcomere, 368, 369f size of, 17 structure, 360 types, 360 unconventional, 360, 362–364, 363f, 364f stereocilia of inner ear and, 364, 365f Myosin cross-bridges, 367f, 368f Myosin I, 362, 373f Myosin II (conventional myosin), 360–362 antibodies of, 599 in cell locomotion, 378 filaments, 362, 362f bipolarity, 362, 362f functions, 360, 361f head and neck domains, 360, 361f, 369 interaction with actin, 360, 362f, 369–371, 371f, 378 in muscle contraction, 369, 369f energetics, 369–370 operation of, 369–370 S1 fragment, 361f, 369, 369f structure, 360, 361f tail domain, 361f, 362 Myotonic dystrophy, 404HPf N nAChR. See Nicotinic acetylcholine receptor NAD, NAD+, 113f in carbohydrate metabolism, 184f in fatty acid cycle, 186f in fermentation, 114, 114f in glycolysis, 183f, 184f in oxidation, 112, 112f regeneration, 183f, 188HP NADH, 113f in anaerobic oxidation, 114f in electron transport, 194f in fatty acid cycle, 186f in fermentation, 114, 114f in glycerol phosphate shuttle, 187f in glycolysis, 183, 184f, 185 in oxidative phosphorylation, 187f in redox reactions, 189–190 in TCA cycle, 186 NADH dehydrogenase, 194f, 195–196, 195f NAD+-NADH couple, 189–190 NADP, 113f NADPH, 113f electron transfer from, 114–115 formation by photosynthesis, 223, 223f, 225 photon requirements, 225 Na+/glucose cotransporter, 161, 161f Nanometer (nm), 17 Nanotechnology, 329 Nanotubes, tunneling, 264, 264f Natural killer (NK) cells cancer killed by, 702, 702f in immune response, 701f, 702, 703f Natural selection, 413fn conserved sequences and, 413, 413fn nucleotide changes and, 463–464 Ncd motor protein, 336 Ndc80 complexes, 585, 586f, 595f, 596 Negatively supercoiled (underwound) DNA, 398, 398f, 430f Negative selection, 413fn of T cells, 721, 721f Negative staining, for TEM, 744, 744f NER. See Nucleotide excision repair Nernst equation, 165, 165fn

Nerve gases, and neurotransmission, 170 Nerve impulses, 164, 167–171 membrane potentials and, 164–171 propagation (conduction), 167–168, 167f saltatory conduction, 167, 168f Nervous system development, 381, 382f cell-adhesion molecules in, 253, 254f neural crest migration, 242f, 243, 243f Nervous system tumors genes associated with, 673t incidence and mortality of, 665f Nestin, 354t Neural crest cells, embryonic migration, 242f, 243, 243f, 356 Neural plate formation, 381, 382f Neural tube formation, 254f, 381, 382f Neurodegeneration, autophagy for, 305 Neurodegenerative diseases neurofilaments in, 356 tau microtubule-associated protein and, 331 from trinucleotide expansion, 404HP–405HP, 404HPf from mitochondrial abnormalities, 208HP from myosin mutations, 364 protein conformation and, 66HP–69HP, 404HP–405HP synaptic dysfunction and, 171 Neurofibrillary tangles (NFTs), 331 AD and, 70HP FTDP-17 and, 331–332 Neurofilaments, 356 in axonal transport, 333f Neuromuscular junctions, 168f, 169, 370 Neurons (nerve cells), 164, 164f. See also Axons axonal transport, 333–334, 333f cytoskeleton components, 325f dynein in, 338–339 in lysosomal storage disease, 306HPf saltatory conduction, 167, 168f Neurospora, 427–428, 427f Neurotransmitters, 168–171, 168f, 169f early studies, 171EP effects of toxins, 302 excitatory and inhibitory, 170 exocytosis of, 303, 303f reuptake, 170 SNARE docking proteins and, 302 Neurotransmitter receptors, 169–171, 1671HP–174HP, 621 Neutral evolution, 413fn Neutrophils, 255HP, 255HPf, 304, 315, 377fn Nexin (interdoublet) bridge of axoneme, 346, 347f, 352–353 NHEJ.
See Nonhomologous end joining Nicotine, effects on muscle, 171EP Nicotinic acetylcholine receptor (nAChR), 172EP–175EP, 172EPfn isolation of, 172EP, 172EPf structure of, 173EP–174EP, 174EPf, 759 Niemann-Pick disease, 306HPt, 313–314 Nitric oxide (NO), as intercellular messenger, 655–656, 655f guanylyl cyclase activation, 656 phosphodiesterase inhibition, 656 Nitrogen electronegativity of, 34 hydrogen bonds with, 37f Nitrogen fixation, 14 of cyanobacteria, 14 Nitrogenous bases

in DNA, 393–394 of nucleotides, 77–78, 78f terminology, 394fn Nitroglycerine, 656 NK cells. See Natural killer cells NMD. See Nonsense-mediated decay NMDA receptor. See N-Methyl-D-aspartate receptor Nodes of Ranvier, 164f, 167, 168f Noncoding RNA (ncRNA), 461 See also Long noncoding RNA Noncompetitive enzyme inhibition, 105, 105f Noncovalent bonds, 34–38, 36f in multiprotein complex, 61, 61f in myoglobin, 58–59, 59f in polypeptide chains, 57 Noncyclic photophosphorylation, 226 Nondisjunction of chromosomes, 608HP–609HP, 608HPf Nonelectrolytes, 148 diffusion across membranes, 149 Nonfibrillar collagens, 240 Non-Hodgkin’s B-cell lymphoma, immunotherapy for, 688 Nonhomologous end joining (NHEJ), 567–568, 568f Nonpolar (hydrophobic) amino acids, 52f, 53, 134–135, 135f Nonpolar molecules, 34, 36f lipids, 47 Nonreceptor protein-tyrosine kinases, 636 Nonrepeated (single-copy) DNA sequences, 401, 405–406, 405f Nonsense-mediated decay (NMD), 474–475 Nonsense mutations, 474 in genetic code, 464f Nonsteroidal anti-inflammatory drugs (NSAIDs), cancer prevention with, 668 Nonsynonymous nucleotide changes, 464, 464f Nontranscribed spacer, 437, 437f Norepinephrine, 169 Northern blot, 763 N-terminus, of polypeptide chain, 51 Nuclear envelope, 8f–9f, 10, 488–493, 489f in mitosis, 588 origin of, 27EPf Nuclear export signals (NESs), 492 Nuclear lamina, 489–490, 489f Nuclear localization signal (NLS), 492 Nuclear magnetic resonance (NMR) spectroscopy, of proteins dynamic movements in, 59–60 tertiary structure, 57fn Nuclear pore complex (NPC), 490–492, 490f, 491f, 493f structure of, 491–492, 491f Nucleating proteins, of actin, 372, 373f Nucleation of actin filaments, 358–359, 359f role of nucleating proteins, 372 of microtubules, 338–341, 339f, 340f, 342f Nucleic acids, 77–79, 77f, 78f, 79f CJD and, 66HP early studies, 420EP–423EP fractionation of, 760 by gel electrophoresis, 760, 761f by ultracentrifugation, 760–762, 762f functions, 77 origin of term, 420EP precursors of, 42 
terminology, 394fn of virus, 24f Nucleic acid hybridization, 400, 402–403, 762–764, 763f

Nucleoid, nucleus compared with, 10 Nucleoli, 8f–9f, 11f, 435, 435f, 488 events in, 436–437, 436f ribosomal assembly in, 440 Nucleoplasm, 8f–9f, 488, 488f Nucleoplasmin, 492, 493f Nucleoporins, 491 Nucleosides, 77, 393f terminology, 394fn Nucleosomes, 494–496, 494f, 494t DNA replication and, 562–564, 563f folding of, 496 placement of, 529–530, 529f structure of, 495, 495f Nucleosome core particle, 494 Nucleotides, 77. See also Base pairs in DNA, 393–394 5’ and 3’ ends, 393f, 394 early studies, 420EP–423EP polarity, 393f, 394 structure, 393–394, 393f Watson-Crick model, 386f, 396 in energy metabolism, 394fn functions of, 79 as precursors, 42 in RNA, 394fn structure, 77–78, 78f terminology, 394fn in transcription, 430f Nucleotide excision repair (NER), 565, 566f defects in, 568HP Nucleotide sequences changes in. See also Base substitutions nonsynonymous compared with synonymous, 464, 464f complementarity, 396 double-strand formation and, 400 evolutionary relationships and, 27EP–29EP, 28EPt relation to amino acid sequences, 396, 428 relation to genes, 396 species diversity and, 15, 396 triplets of genetic code, 404HP, 462, 464, 466 wobble hypothesis, 466, 467f Nucleus of cell, 8f–9f, 488–512 chromosomes and chromatin, 493–509 cloning and, 513 eukaryotic compared with prokaryotic, 10 nuclear envelope, 488–493, 489f nuclear pore complex, 490–492, 490f, 491f, 493f nucleoid compared with, 9 organization of, 510–512, 510f, 511f, 512f gene expression and, 510–511, 511f, 512f RNA transport, 492–493 size of, 17 structure and function of, 488–512, 488f Nude mice, 18f Numerical aperture (N.A.), 734 O O2•−, 35HP Obese individuals, bacteria in, 15 Obesity, cancer and, 668 Objective lens, 733–734, 733f of TEMs, 741–742, 742f Occludin, in tight junctions, 262, 262f Ocular lens, 733, 733f Odorant receptors, 634–635 Oils, 48–49, 48f Okazaki fragments, 552, 553f, 561, 562f Oligodendrocytes, 21HP, 167

Oligosaccharides, 45, 129, 130f assembly, 285–286, 287f, 292, 293f modification, 286, 292, 293f in plasma membrane, 124f, 129, 236 Omega-3 fatty acids, 126 Omnitarg, 688 Oncogenes, 458HP, 671–672, 671f, 672f, 679, 679f, 682f cancer therapy targeting, 689–692, 690t, 691f cell signaling by, 640–643 discovery of, 694EP–697EP, 694EPf, 694EPt, 695EPf, 696EPf encoding cytoplasmic protein kinases, 680 encoding growth factors or growth factor receptors, 680 encoding metabolic enzymes, 681 encoding products affecting apoptosis, 681 encoding proteins affecting epigenetic state of chromatin, 680–681 encoding transcription factors, 680 proto-oncogene activation into, 671–672, 671f, 672f One gene—many polypeptides concept, 454 One gene—one enzyme concept, 428 One gene—one polypeptide concept, 428 Oocytes cell division, 5, 5f delayed maturation, meiotic nondisjunction and, 609HP maturation of, 607, 611EP, 611EPf maturation-promoting factor and, 611EP nuclear transplants, for cell replacement therapy, 21HP, 21HPf primary, 603, 603f in prophase of meiosis, 604, 605f rDNA in, 436, 436f secondary, 608 size of, 436 Oogonia, 603, 603f Operator, of bacterial operon, 485, 485f Operon bacterial, 484–487, 484f, 485f lac, 485–487, 486f, 487f trp, 485, 486f inducible, 485 repressible, 485 Opiates, chronic use of, 633 Optical traps, 328, 431–432, 431f Optical tweezers, 143, 328, 328f Orbitals of electrons, 33f ORC.
See Origin recognition complex Orencia, 726HP Organelles, 10–12, 11f, 270–271, 271f autophagy of, 304–305, 305f in differentiated cells, 16–17 of endomembrane system, 271 of eukaryotic cells, 7, 8f–9f, 10f fractionation studies, 275–276, 276f interdependence, 230f, 231 membrane proteins of, 130 without membranes, 10 in mitosis, 588 molecular chaperones and, 65 movement, role of myosins, 363f, 364 movement along microtubules, 326f, 334f dynein-mediated, 338–339 kinesin-mediated, 336–339, 336f, 337f possible origin, 26EP of prokaryotic cells, 7, 8f–9f proteomics studies, 275 in secretory cells, polarity, 280, 281f

self-assembly, 79–80 sizes of, 17, 19f Organic molecules, 40 Organization of cell, 3–5, 4f Origin of replication, 549, 559 Origin recognition complex (ORC), 559, 560f Oskar gene, 537–538, 538f Osmosis, 149, 150f Osteogenesis imperfecta, 240 Outer mitochondrial membrane (OMM), 180, 316, 317f “Outside-in” signaling, of integrins, 246, 259–260, 654f Ovarian cancer genes associated with, 675f, 678 incidence and mortality of, 665f inhibition of cancer-promoting proteins in, 692 Oxidation anaerobic, 113–114, 114f of carbohydrates, 183–185, 184f energy transfer in, 112f uncoupling from phosphorylation, 198–199 Oxidation-reduction (redox) reactions, 109 photosynthesis and, 214–215 standard free-energy change, 190 Oxidative metabolism, 183–187 Oxidative phosphorylation, 109f, 112, 187, 187f, 189–205 substrate-level phosphorylation compared with, 189 Oxidized state, 109–110 Oxidizing agents, 109, 189 during photosynthesis, 214–215, 221, 225 Oxidizing equivalents, 221–222, 222f 8-Oxoguanine, 567 Oxygen electronegativity of, 34 hemoglobin binding of, 60–61 hydrogen bonds with, 37f myoglobin and, 58 from photosynthesis, 214–215, 221–222, 222f Oxygen atoms reactive peroxy anion, 197 ultrareactive form (singlet oxygen [1O*]), 217 Oxygen/CO2 ratio, photorespiration and, 230 Oxygen-evolving complex, 220f, 221–222 Oxygenic (O2-releasing) photosynthesis, 8, 14, 212, 218–220, 219f P p21 gene, 580, 581, 580f, 676, 679f, 682f p53, 682f, 683, 684f in cell arrest and apoptosis, 580f, 675–677, 675f, 676f, 677f, 679f, 682f in senescence, 677–678, 679f, 682f P680 reaction-center chlorophyll, 219, 219f, 220f P680*, 219 P680+, 219, 221 P700 reaction-center chlorophyll, 219, 219f, 223, 223f P700*, 223 P700+, 223, 223f PABA.
See p-Aminobenzoic acid Pachytene, 605f, 606, 606f Palindromes, 765 Pancreatic beta cells, reprogramming of, 23HP Pancreatic cancer genes associated with, 673t, 675f incidence and mortality of, 665f mutations in, 684, 685t Pancreatic secretory protein dynamics, 273, 274f Pap smears, 670, 670f

Paracellular pathway, and tight junctions, 260, 261f Paracrine signaling, 618, 618f Paramecium, size of, 19f Paraquat herbicide, 225 Parkinson’s disease cell replacement therapy for, 20HP mitochondrial dysfunction and, 208HP neurofilaments in, 356 Paroxysmal nocturnal hemoglobinuria, 137 PARP-1, cancer therapy targeting, 692 Partial trisomy, 505 Partition coefficient, 149, 149f Passive immunization, 69HP, 707 Passive immunotherapy, for cancer, 688 Passive transport, 148f Patch-clamp technique, 152f Paternity suits, and minisatellite DNAs, 402 Pathogens, types of, 700 Pathways, of protein folding, 64, 64f, 65f Pattern recognition receptors (PRRs), 700 PCM. See Pericentriolar material PCNA, 562, 562f PCR. See Polymerase chain reaction Pectins in plant cell wall, 266f, 268 Pemphigus vulgaris, 257 Penetrance, and disease, 417HP–418HP, 417HPfn Penicillin, 106HP–107HP Pentoses, 43 Pentose (5-carbon) sugars in nucleotides, 393f, 394, 394fn Peptide bonds formation during translation, 471, 472f, 473, 479EP in polypeptide chains, 51, 51f Peptide mass fingerprint, 71f, 72 Peptidyl transferase, 471, 478EP–479EP Peptidyl-tRNA, 470-474 Perception, GPCRs in, 634–636 Perforins, 709 Pericentriolar material (PCM), 338–339, 338f, 341, 586 Periodontal disease, 257HP Peripheral membrane proteins, 130, 131f, 143, 146f, 147, 373f Peristalsis, and gap junctions, 263–264 Permeability of tight junctions, 261–262 Peroxisomal targeting signals (PTSs), 316 Peroxisomes, 8f–9f, 206–207, 206f, 207f abnormalities, 208HP–209HP autophagy of, 305f kinesin-mediated transport, 337 microtubules and transport of, 326f in photorespiration, 231 protein uptake, 316 Peroxy anion, 197 Pertussis toxin, 627 PET scans, cancer cells in, 667 pH, 39–40 enzyme–catalyzed reaction and, 103–104, 104f of lysosome enzymes, 303–304 Phage. 
See Bacteriophages Phage display, 783 Phage genomes, eukaryotic DNA cloning in, 768–769, 769f Phagocytes (phagocytic cells), 315 in immune response, 700, 701f, 702 Phagocytic cells, 304, 305f Phagocytic pathway, 315f Phagocytosis, 270f, 308, 315, 315f Phase-contrast microscopes, 735–736, 736f Phase I clinical trial, in drug development, 68HP

Phase II clinical trial, in drug development, 68HP–69HP Phase III clinical trial, in drug development, 69HP PH domain in proteins, 59, 628f, 629 Phenylalanine (Phe, F), 52f, 53 2-Phenylaminopyrimidine, 75–76 Pheo. See Pheophytin Pheophytin (Pheo, Pheo−), 220–221, 220f pH gradient (ΔpH), 198, 198fn Philadelphia chromosome, 504HP, 504HPf Phorbol esters, 630 Phosphatases. See Protein phosphatases Phosphate buffer system, 40 Phosphate groups, 41t in ATP formation, 111f, 112f, 117 in ATP hydrolysis, 93, 93f, 117 in glycolysis, 111–112, 111f, 184f low- compared with high-energy, 113f in membrane lipids, 126, 126f in nucleic acids, 77–78 in nucleotides, 77f, 393f, 394 nucleic acid terminology and, 394fn Phosphate transfer potential, 113, 113f Phosphatidic acid, 126, 127t Phosphatidylcholine (PC), 49, 49f, 126, 126f, 127t, 128–129, 129f Phosphatidylethanolamine (PE), 126, 127t, 128–129, 129f Phosphatidylinositol (PI), 126, 129, 129f Phosphatidylinositol-derived second messengers, 311–312, 627–629, 645f phosphorylation of, 628–629, 628f, 629f Phosphatidylinositol-specific phospholipase C-β (PLCβ), 629–630 Phosphatidylserine (PS), 127t, 128–129, 129f Phosphodiesterase, NO inhibition of, 656 3’, 5’-Phosphodiester bonds, 77f, 78, 393f, 394 Phosphoenolpyruvate, 111f, 116f Phosphoenolpyruvate carboxykinase (PEPCK), 116f, 522–523, 525 Phosphofructokinase, 116f, 117 2-Phosphoglycerate, 111f 3-Phosphoglycerate (3-PGA), 111f, 112, 112f carbon dioxide fixation and, 226–228, 227f in glycolysis, 111f, 112, 184f in photorespiration, 229f, 230f Phosphoglycerate kinase, 111f, 113 Phosphoglycerides, 126, 126f backbone of, 125–126, 125f, 126f groups linked to, 126 2-Phosphoglycolate, 229, 229f, 230f Phosphoinositides. See Phosphatidylinositol-derived second messengers Phospholipase C, 59, 59f, 628f, 629–630, 629f, 630f, 631t Phospholipids, 49, 49f in low-density lipoproteins, 313, 314f in membranes, 123f, 125, 126f mobility, 140, 141f, 143–144, 144f modifications, 285, 286f
synthesis, 285 in plasma membrane, 124f transfer between membranes, 285, 286f Phospholipid-transfer proteins, 285 Phosphors, 749 Phosphorylation. See also Protein-tyrosine phosphorylation of ADP. See ATP formation conformational changes in proteins and, 157–159, 158f, 202, 203f, 204–205 in glycolysis, 111–112, 111f

ion pumps and, 157–159, 158f of phosphatidylinositol-derived second messengers, 628–629, 628f, 629f of RNA polymerase II, 443, 443f to stop protein synthesis, 288, 289f of tau microtubule-associated protein, 331 of tyrosine, 620f, 636–638, 637f uncoupling from oxidation, 198–199 Phosphotyrosine-binding (PTB) domain, 638–640, 639f Phosphotyrosine-dependent protein–protein interactions, 638, 638f Photoactivatable GFP (PA-GFP), 740, 740f Photoautotrophs, 212 Photoinhibition, 222 Photolysis, and photosystem II, 219f Photons, 216 absorption, 216 in photosynthesis, 218, 218f, 219f, 220f, 221–222, 223f Photophosphorylation, 225–226, 226f Photorespiration, 229–231, 229f, 230f Photosynthesis, 211–234, 215f, 229f action spectrum, 217, 217f aerobic respiration compared with, 215f in bacteria, 8, 14, 212 in cyanobacteria, 8, 14, 14f, 27EP electron flow, 219f, 220–221, 224f as energy source, 5, 5f light absorption, 216–217 light-dependent reactions, 215 NADPH in, 114–115 overall light reaction, 225 overall reaction, 212, 214–215 overview, 212 oxygenic, 212, 218, 219f as redox reaction, 214–215 Photosynthetic bacteria, 212, 212f Photosynthetic pigments, 216–217, 217f Photosynthetic prokaryotes, in evolution of eukaryotes, 27EPf Photosynthetic reaction centers, 218–225, 218f, 220f, 223f Photosynthetic units, 218–225 Photosystems, 218–225 effects of herbicides, 225 reaction-center chlorophylls, 219f Photosystem I (PSI), 218–219, 219f, 222–223, 223f, 224f cyclic photophosphorylation and, 226, 226f Photosystem II (PSII), 218–223, 219f, 220f, 224f electron flow, 220–221, 220f light absorption, 218–219, 220f photoinhibition, 222 taking electrons from water, 221–222 Phragmoplast, 341f, 342, 601, 601f Phylloquinone, 223 Phylogenetic trees, 28EP–29EP, 29EPf Phytol tail, 216f PI3K. See PI 3-kinase PI 3-kinase (PI3K), 629 in cancer development, 678, 681, 682f in insulin receptor signaling, 645–646, 645f Pigments, 216. 
See also Antenna pigments; Chlorophylls energy transfer, 220–221 photosynthetic, 216–217, 217f of photosynthetic unit, 218–219 Pigment granules, 364, 364f Pinocytosis. See Bulk-phase endocytosis PIP3, 628–629, 628f, 645f

piRNAs. See Piwi-interacting RNAs Piwi-interacting RNAs (piRNAs), 460–461 PKA. See Protein kinase A PKA-anchoring proteins (AKAPs), 634, 635f, 643 PKB. See Protein kinase B PKC. See Protein kinase C Placebo-controlled trials, 69HP Planar ring, 43 Plants ATP synthesis, 225–226 C3 plants, 227 C4 plants, 232 Ca2+ concentration regulation in, 652–653, 652f CAM plants, 232 carbohydrate synthesis, 226–232, 227f, 228f cell signaling in, 648 cell walls, 266–268, 266f, 267f chloroplasts, 8f–9f, 213–214, 318 cloning of, 512–513 CO2 fixation, 226–232, 227f, 228f cytokinesis in, 601–602, 601f energy sources and storage, 227–228 genetically engineered, 232 ion pumps, 159 membrane functions, 122f microfilaments, 356 microtubules, 331f, 332, 341–342, 341f, 342f mitochondria in, 179–180 movement proteins, 265, 265f organelle interrelationships, 230f, 231 origin of, 27EPf osmosis in, 150–151, 150f peroxisomes and glyoxysomes, 8f–9f, 206, 207f photorespiration, 229–231, 229f photosynthesis, 211–234 plasmodesmata, 265–266, 265f secondary active transport systems, 161 stomata, 232, 652–653 transgenic, 776–778, 777f turgor pressure, 150–151, 150f, 308, 332 vacuolar proteins, synthesis, 282–283 vacuoles, 307–308, 307f Plasma cells, in immune response, 704f, 705–706, 706f Plasmalogens, 206–207 Plasma membrane, 8f–9f, 11f, 120–176 carbohydrates in, 129–130 Davson-Danielli model of, 124 differentiated domains, 144, 144f, 145f dynamic nature, 127–128, 128f, 140–147 in endocytosis, 308, 309f of epithelial cells, 144, 144f of erythrocyte, 145–147 in exocytosis, 303, 303f “fences” in, 143, 143f, 144f lipid mobility and, 143 functions of, 121–122, 147 lipids in, 126f in myelin sheath, 125f skeleton of, 137, 143, 143f, 146f, 147 structure, 123–125, 123f, 124f three-layered nature, 120–121, 121f targeting proteins to, 300 in tight junctions, 260–262, 261f transport of substances, 147–161 Plasmids, bacterial, eukaryotic DNA cloning in, 767–768, 767f, 768f Plasmodesmata, 8f–9f, 122f, 265–266, 265f
Plasmolysis, 150, 150f Plastids, 47. See also Chloroplasts Plastocyanin (PC), 222–223, 223f

Plastoquinol (PQH2), 220f, 221, 221f, 223f Plastoquinones (PQ), 220–221, 220f, 221f Platelet aggregation, 245–246, 247f Platelet-derived growth factor (PDGF), oncogenes encoding, 680 Plectin, 333f, 353f Pluripotent cells, 21HP induced, 22HP–23HP, 22HPf, 519 Plus (fast-growing) end of actin filament, 358, 358f, 359f of microtubule, 330–331, 340 dynamic behavior and, 342–344, 343f, 344f Pneumococcus, S and R forms, 421EP, 421EPf Polar, charged side chains, of amino acids, 51–53, 52f, 53f Polar, uncharged side chains, of amino acids, 52f, 53 Polar (hydrophilic) head groups, of membrane lipids, 123, 123f, 126, 126f Polarity, 34 of cell action potential and, 165–166 membrane domains and, 144 diffusion across membranes and, 149 electrostatic attraction and, 37 of microtubules, 334, 340 establishment of, 340f, 341 of motile cell, 377, 378f motor proteins and, 334 of nucleotides, 393–394, 393f of water, 34 Polar molecules, 34 hydrophilic nature, 34 Poleward flux of microtubules, 590–592, 592f Polyacrylamide gel electrophoresis (PAGE), 756–757, 756f SDS–PAGE, 146f, 757 two-dimensional electrophoresis, 71f, 757, 757f Polycystic kidney disease (PKD), 349HP–350HP Polymers, 41, 42f Polymerases, DNA, 550–552, 551f, 554, 554fn, 555f in eukaryotes, 561–562, 562f structure and function of, 554–558, 555f, 556f, 557f, 558f Polymerase β, 567 Polymerase chain reaction (PCR), 769–771, 770f applications of, 770–771 Polymerase η, 569 Polymer dynamics, measuring, 329–330, 330f Polymerization, 41, 42f Polymorphisms, 130, 130f, 416–417, 416f disease risk and, 417HP–418HP, 417HPfn Polyoma virus, cancer caused by, 668 Polypeptide chains assembly of, 50–51, 282–283, 282f elongation, 471–474, 472f initiation, 468–471, 469f mechanism for, 428, 428f sites of, 280 synthetic, 73 conformations of, 55–57, 56f, 57f alpha helix, 55, 56f beta-pleated sheet, 56–57, 56f native, 428f DNA base pairs and, 513 genes and, 70 Polyploidization, 406–407 Poly(A) polymerase, 449, 536 Polyribosomes
(polysomes), 475–477, 475f, 476f Polysaccharides, 45–47 cellulose, chitin, and glycosaminoglycans, 46f, 47, 47f

of extracellular matrix, 239f glycogen and starch, 45–47, 46f Polyspermy, 388 Poly(A) tail, 448 Polytene chromosomes, 392, 392f Polyunsaturated fatty acids, 48, 126 Porins, 182, 182f, 213 Porphyrin ring, 191f, 216, 216f Positive selection, 413fn of T cells, 721, 721f Positively supercoiled DNA, 398, 430f, 550 Posttranscriptional gene silencing (PTGS), 456 Posttranscriptional modifications, in RNA bases, 439–440, 439f, 465 Posttranslational control, 541–542, 541f Posttranslational modifications (PTMs), defined, 54 Potassium equilibrium potential (EK), 165–166 Potassium ions, concentration gradient, 157 electrical gradient compared with, 165 Potassium ion channels, 153–156, 154f, 155f, 156f conformational changes in, 136–137, 137f in nerve cells, 165 in plant guard cells, 652-653 prokaryotic, 152–153 voltage-gated, 152–156, 153f, 154f, 155f, 156f in action potential, 166, 166f inactivation, 155, 156f opening and closing of, 153–156, 154f, 156f Potassium leak channels, 165, 166f Potato spindle-tuber disease, 26 pRB protein, 682f in cell cycle regulation, 674, 675f Precancerous cells, 607f, 670 Precipitation, selective, of proteins, 752–753 Preclinical trials, in drug development, 75f Precocious puberty, GPCRs mutation in, 626HP Prednisone, 725HP Pregnancy, antibodies in, 713 Preinitiation complexes (PIC) in transcription, 441–443, 443f assembly, 442f, 443f in translation, 469 Premature termination codons, 474–475 Premeiotic S phase, of meiosis, 603–604 Pre-miRNAs, 456f, 460 Pre-mRNAs, 428f, 448 methylguanosine cap and poly(A) tail, 449–450, 449f processing, 448–454, 448f, 449f, 453f overview, 450f splice sites, 449, 450f, 452f Preprophase band, 341f, 342, 601 Pre-RC. 
See Prereplication complex Prereplication complex (Pre-RC), 560–561, 560f Pre-RNAs, 434 modification of, 439, 439f in nucleolus, 435–436 processing of, 437–440, 438f snoRNA role with, 438–440 synthesis of, 436 Pre-rRNAs, processing, self-splicing, 477EPf Pribnow box, 433 Primary cilia, 345, 349HP, 349HPf Primary culture, 750 Primary electron acceptors, 219, 220f, 223, 223f Primary oocytes, 603, 603f Primary spermatocytes, 603, 603f Primary structure, of proteins, 55, 55f Primary transcripts, 434. See also Pre-RNAs for micro-RNA, 456f, 460 for mRNA conversion to mRNA, 448–454

cotranscriptional processing, 448f introns and, 445–446, 453f splicing, 449–454 Primase, 552, 553f, 554, 554f, 561–562, 562f Primers, 550–551, 551f RNA, 552–553, 553f, 554f Pri-miRNAs, 456f, 460 Primordial germ cells, embryonic, 243, 531f Primosome, 554, 554f Prion proteins, 66HP, 137 Prion protein cellular protein (PrPC), 66HP, 137 PRNP, 66HP Procentriole, 586 Processive movement, 335 of motor proteins, 335, 362–363, 363f of RNA and DNA polymerases, 431, 554–556 Professional APCs, 707–708, 707f, 708f, 722–723 Profilin, 373, 377f, 378 Progenitor cells, in tumor development, 669, 670f Programmed cell death. See Apoptosis Prokaryotes. See also Bacteria cytoskeleton, 324–325 as eukaryote ancestors, 26EP, 27EPf, 29EP evolutionary relationships, 28EPt gene expression control in, 484–488 bacterial operon, 484–487, 484f, 485f genome organization, 484 riboswitches, 487–488 lateral gene transfer in, 29EP transcription in, 432–433, 432f translation in, initiation of, 536–537, 536f Prokaryotic cells, 7–8. 
See also Bacteria advent of, 9f biofilms of, 13 cell division of, 12 diversity of, 14–15, 15t domains of, 14 eukaryotic cells compared with, 8–13, 8f–9f, 10t DNA of, 10 evolutionary relationships, 28EP–30EP RNA polymerases, 433–434 shared properties of, 8–9 structure of, 10–12, 11f transcription factors, 434 flagella of, 12, 13f metabolism in, 13 numbers and biomass of, 15t size of, 17, 19f structure of, 8f–9f, 16f types of, 10t, 14–15, 14f Prokaryotic K+ channels, 152–153, 153f, 154f Proline (Pro, P), 52f, 53–54 Prometaphase, of mitosis, 582f, 588–590, 589f, 590f, 591f, 598f Promoters, 525–526 of bacterial operon, 485, 485f core, 522, 523f, 526fn distal, 523f, 525, 526fn in transcription, 430 in eukaryotes, 441, 442f, 522–524 functions, 430–431 in prokaryotes, 432–433, 432f, 433f proximal region, 522, 523f, 526fn for RNA polymerases, 441–443, 442f, 443f Propagation of nerve impulse, 167–168, 167f Prophase of meiosis, 603–604, 604f, 605f diakinesis, 607 diplotene, 606–607, 606f leptotene, 604, 604f, 605f

pachytene, 606, 606f zygotene, 604, 604f, 605f of mitosis, 582f, 583–588, 583f, 584f mitotic spindle formation, 586–588, 587f, 588f nuclear envelope and organelles in, 588 Proplastids, 213 Prostate cancer, 665 development of, 669 drugs preventing, 669 genes associated with, 673, 675f incidence and mortality of, 665f screening tests for, 73 translocation in, 504HP Prosthetic groups, 191 Proteasomes function of, 541–542, 718, 719f in quality control, 288, 288f structure and function of, 541, 541f Proteins, 50–65, 50f, 70–77 adaptation and evolution, 76–77, 76f, 77f of biosynthetic (secretory) pathway, 272 building blocks of, 50–65, 50f, 70–77 amino acid side chains, 51–54, 51f, 52f, 53f amino acid structure, 50–51, 51f Ca2+-binding, 651–652, 651t, 652f cancer-promoting, inhibition of, 689–692, 690t, 691f in cell signaling, 621 CJD and, 66HP in complex with DNA, 32f conformational changes, 115 allosteric modulation, 115–116, 116f covalent modification, 115 molecular motors and, 334 phosphorylation and, 157–159, 158f conjugated, 95 denaturation of, 63 diversity.
See Alternative splicing families of, 76–77, 405 fibrous, 58 functions of, 50, 50f, 83EP identification of, 72–73 globular, 58 histone residues and, 499, 500f homologous forms, 76 hub, 62–63, 63f interactions of, 61–65, 61f, 62f, 63f with nucleic acids, 429–430, 467–468 isoforms, 76, 407 isolation, purification, and fractionation of, 752 liquid column chromatography, 753–756, 753f, 754f, 755f PAGE, 756–757, 756f, 757f protein measurement and analysis techniques, 757–758, 758f, 758fn selective precipitation, 752–753 man-made, 74, 75f measurement and analysis of, 757–758, 758f, 758fn modification in endoplasmic reticulum, 283, 285–288, 287f Golgi complex, 292 multiple functions, 70fn multiprotein complex, 61, 61f posttranslational modifications of, 54 precursors of, 42 as presumed genetic material, 420EP–422EP proteome, 70 proteomics, 70–73, 71f, 72f, 73f recruiting of, 272–273 redox control of, 228–229, 229f

relation to genes, 427–429 self-assembly of, 64, 64f stability of, 541–542, 541f structure of, 54–61 determination of, 758–760, 758f, 759f, 760f domains of, 59, 59f dynamic changes within, 59–60, 60f primary, 55, 55f quaternary, 60–61, 61f secondary, 55–57, 56f, 57f tertiary, 57–58, 57f water in, 38, 39f study techniques, 71–73, 71f, 275 isolation, purification, and fractionation, 71, 71f localizing by fluorescence microscopy, 327, 327f types of, 50, 50f unfolding of, 63 Protein coats on vesicles, 295–297, 295f, 299–300 early studies, 319EP, 319EPf in vesicle budding, 276, 277f of viruses, 23–24, 23f, 24f Protein engineering, 73–76, 75f carbon bond breaking with, 74 Protein folding, 63–65, 64f, 65f, 287–288, 428f chaperones in, 65, 65f, 80EP–83EP, 83EPf, 287–288, 288f, 289f controversies in, 64, 64f, 65f denaturation of, 63–64, 64f misfolded proteins, 82EP, 287–288, 288f fate of, 288 Huntington’s disease and, 404HP–405HP quality control screening, 288f misfolding, 66HP–70HP pathways of, 64, 64f, 65f refolding of, 64, 64f Protein kinases, 93f, 115 activation, 246, 248f, 249, 260 caspases targeting, 658 in cell cycle control, 575–579, 575f, 579f, 611EP-614EP Cdk inhibitors, 577 Cdk phosphorylation/dephosphorylation, 576f, 577, 577f controlled proteolysis, 577f, 578 cyclin binding, 576–577, 576f subcellular localization, 578, 578f in cell signaling, 260, 619–621, 619f, 636-646 in glucose mobilization, 632, 633f oncogenes encoding, 680 in plant cell walls, 268 in transcription, 443–444 Protein kinase A (PKA), 632–634, 633f, 634f, 635f Protein kinase B (PKB, AKT), 645–646, 645f in cancer development, 678, 681, 682f Protein kinase C (PKC), 629f, 630 Protein microarrays, 72, 73f Protein phosphatases, 304t, 725 in cell signaling, 619–620, 619f, 632, 635f in mitosis, 585 Protein–protein interactions, 61–65, 61f, 62f, 63f in Drosophila melanogaster, 62 folding, 63–65, 64f, 65f of hub proteins, 62–63, 63f multiprotein complex, 61, 61f network of, 62–63, 63f 
phosphotyrosine-dependent, 638, 638f regulation of, 61–62 techniques for determining, 754–756, 754f, 755f yeast two-hybrid assay for, 62, 755 Protein pumps, gated channels compared with, 161

Protein superfamilies, 76–77 Protein synthesis, 468–477, 476f blocking by RNA interference, 456f, 457 ending by phosphorylation, 288, 289f in eukaryotes compared with prokaryotes, 476, 476f initiation, 468–471, 469f on membrane-bound compared with free ribosomes, 280–282 on membrane-bound ribosomes, 282–283, 282f, 318, 318f quality control mechanisms, 287–288, 288f role of microsomal membranes, 277–278 in rough ER, 273, 274f, 280-284 signal hypothesis, 281–282 termination, 474 Protein transport along biosynthetic/secretory pathway, 271–272, 272f back to endoplasmic reticulum, 298 across endoplasmic reticulum membrane, 281–282, 281fn, 282f into endoplasmic reticulum membrane, 284f from endoplasmic reticulum to Golgi complex, 272f, 289, 290f, 296f in Golgi complex, 290, 292–295, 294f from Golgi complex to endoplasmic reticulum, 296f sorting at trans Golgi network, 299–300, 299f from trans Golgi network to destination, 272f, 299–303, 299f into chloroplasts, 318, 318f across membranes, chaperones in, 317 into mitochondria, 316–318, 317f into peroxisomes, 316 posttranslational, 281fn, 316–318 retrograde, 294f, 296f, 298 sorting (recognition) signals, 298 studies of, 273–275, 274f, 275f targeting to destinations, 272, 299–300, 299f Protein-tyrosine kinases activation of, 638 in chronic myelogenous leukemia, 75, 75f discovery of, 695–696 in cell signaling, 636–646 insulin receptors as, 644, 644f in signal transduction pathways of lymphocyte activation, 723–724 Protein-tyrosine phosphorylation, 620–621, 620f, 636, 725 in cancer, 679–680, 682f, 688–691, 695–696 in cell-cycle regulation, 577 discovery of, 695-696 downstream signaling processes activated by, 638–640, 639f end of, 640 in insulin receptor signaling, 644 diabetes mellitus and, 646 glucose transport, 646, 646f human longevity and, 647HP–648HP, 647HPf insulin receptors as protein-tyrosine kinases, 644, 644f insulin receptor substrates 1 and 2 in, 644–646, 645f MAP kinase adaption, 643–644, 643f 
phosphotyrosine-dependent protein–protein interactions in, 638, 638f in plants, 648 protein kinase activation in, 638 Ras-MAP kinase pathway activation by, 640–644, 641f, 642f, 643f receptor dimerization in, 636–638, 637f

Proteoglycans, 239f, 240–241, 241f Proteolysis, 613EP in anaphase of mitosis, 592, 593f in cell cycle control, 577f, 578 in regulating protein lifetime, 541–542 Proteome, 70 Proteomics, 70–73, 71f, 72f, 73f, 275, 757–758 medical uses, 72–73 research questions, 70–71 Protists, 15–16, 15f, 27EPf Protofilaments, of microtubules, 330–331, 331f Protons, 39–40, 39fn in ATP formation, 187, 187f cytochrome oxidase and, 196–197, 197f exclusion from aquaporin channels, 151, 151f movement (translocation) across mitochondrial membrane, 187, 187f, 193, 194f, 196–199, 200f across thylakoid membrane, 222, 224f, 225–226, 226f movement through ATP synthase, 202, 204–205, 204f “substrate” compared with “pumped,” 197 Proton-conduction pathways (“proton wires”), 193 Proton gradients, 186, 187f ATP formation and, 193 in light-driven active transport, 160 from photosystem II, 220f, 221 proton-motive force and, 198 across thylakoid membrane, 223, 225–226, 226f Proton-motive force (Δp), 198–199, 198f, 205 Proton pumps, 193 in bacteria, 160 in plants, 159 redox-driven, 196, 197f Proton transfer in enzyme actions, 99, 100f in oxidation, 112 “Proton wires” (proton-conduction pathways), 193, 197 Proto-oncogenes, 671 activation of, 671–672, 671f, 672f types of proteins encoded by, 679f Protoplasts, 751 Proviruses, 25–26 PSA test for prostate cancer, 73 Pseudogenes, 408, 408f Pseudomonas aeruginosa, in cystic fibrosis, 162HP–163HP, 163HPf PTB domain. 
See Phosphotyrosine-binding domain PTEN gene, in cancer development, 678, 682f P-type ion pumps, 158–159, 159f Pulse-chase experiments, 273, 274f Purification, of proteins, 752 liquid column chromatography, 753–756, 753f, 754f, 755f PAGE, 756–757, 756f, 757f protein measurement and analysis techniques, 757–758, 758f, 758fn selective precipitation, 752–753 Purines, 78, 78f, 394 Pyranose ring, 44, 45f Pyrimidines, 78, 78f, 107HPt, 394 Pyrophosphates, in transcription, 430f Pyruvate, 111f, 116f anaerobic oxidation (fermentation), 113–114, 114f generation in glycolysis, 183, 185 in TCA cycle, 185f, 186f Pyruvate dehydrogenase, 61, 61f

Q Q cycle of proton translocation, 194f, 222, 223f Quality control mechanisms mRNA surveillance, 474–475 in protein synthesis, 287–288, 288f Quaternary structure, of proteins, 60–61, 61f Quinolones, 106HP, 107HPt Quinones, 191f, 220–221, 220f, 221f, 223 as free radicals, 220f R Rabs, 301, 301f, 363f, 364 Rabies, host range of, 24–25 Radiation. See also Gamma radiation; Ultraviolet radiation as cancer therapy, 665, 677, 687 as carcinogen, 667 DNA damage from, 568–569, 568HP–569HP as mutagen, 392 Radioisotopes, 748–749, 749t, 750f Radon, DNA damage from, 569HP Raf protein, 642 oncogenes encoding, 680, 682f Raloxifene, 669 Ran protein, 492 Ras-MAP kinase cascade, RTK activation of, 640–644, 641f, 642f, 643f RAS oncogene, 640, 679, 682f, 697P Ras protein, 640–644, 641f, 642f, 643f as membrane proteins, 138 Rate-zonal sedimentation. See Velocity sedimentation RB gene, 682f in cell cycle regulation, 674, 675f in retinoblastoma, 673, 673t, 674f rDNA. See Ribosomal DNA Reactants concentration of, free-energy change and, 91–92 at transition state, 96–97, 96f Reactions. 
See Biochemical reactions; Chemical reactions Reaction cascade, 619f, 632 Reaction-center pigments (chlorophylls), 219f, 220, 220f, 222–223, 223f electron transfer, 220 Reaction rate, 95–99, 95t, 102–105, 103f, 105f activation energy and, 96–97, 96f effect of enzymes on, 95, 95t initial velocity, 103, 103f Lineweaver-Burk plot, 103, 103f maximal velocity, 103, 103f effects of enzyme inhibitors, 104–105, 105f Michaelis-Menten equation, 103 Michaelis-Menten relationship, 102 rate-limiting factors, 103 reactant concentration and, 90–91, 102–104, 103f Reading frame, 468, 473–474 frameshift mutations, 474 Reannealing (renaturation) of DNA, 400, 401f Rebif, 726HP RecA protein, 610 Receptors in cell signaling, 618–619 survey of, 621 GPCRs, 617f–618f blood glucose regulation, 631–634, 631f, 632f, 633f, 633t, 634f, 635f disorders associated with, 625HP–626HP, 625HPf, 626HPt second messengers of, 621–622, 622f, 622t, 627–630, 627f, 628f, 629f, 630f, 631t in sensory perception, 634–636

signal transduction by, 622–627, 623f, 625f specificity of responses of, 630–631 insulin, 644 diabetes mellitus and, 646 glucose transport, 646, 646f human longevity and, 647HP–648HP, 647HPf insulin receptors as protein-tyrosine kinases, 644, 644f insulin receptor substrates 1 and 2 in, 644–646, 645f Receptor down-regulation, 312, 624, 625f Receptor-ligand complexes, endocytosis of, 313f, 624, 625f Receptor-mediated endocytosis (RME), 308–311, 309f, 319EP–322EP, 624, 625f in therapy of lysosomal disorders, 307HP Receptor protein-tyrosine kinases (RTKs), 621, 626, 636-646 in cancer, 679–680, 682f, 688–691 dimerization of, 636–638, 637f downstream signaling processes activated by, 638–640, 639f Ras-MAP kinase pathway activation by, 640–644, 641f, 642f, 643f termination of signaling by, 640 Recessive alleles, 387–388 Recognition sequences for cytoplasmic RNA localization, 537 for miRNA binding, 460 for poly(A) tail formation, 449–450 for RNA longevity, 539 for snoRNA binding, 439-440 for RNA splicing, 450, 450f Recognition signals in protein transport, 272, 298 in protein uptake, 316 Recombinant DNA technology, 764 DNA cloning in, 766–769, 767f, 768f, 769f recombinant DNA formation in, 766, 766f restriction endonucleases in, 764–766, 765f Recombination frequency, 391–392, 417, 419HP, 611 Recombination nodules, 606 Recoverin, 59f Redox control, 228–229, 229f Redox potentials, 189–190 of electron carriers, 192–193, 192f Redox reactions. 
See Oxidation-reduction reactions Reduced coenzymes ATP formation and, 186–187 in electron-transport chain, 187 energy from, 187 in fatty acid cycle, 186f in oxidative phosphorylation, 186, 187f in TCA cycle, 185f, 186 Reduced state, 109–110 Reducing agents, 109, 189–190 during photosynthesis, 214–215, 221, 225 Reducing power, 114–115 Refractory period, after action potential, 166, 167f Regulated secretion, 271, 272f, 300, 303 Regulators of G protein signaling (RGSs), 625 Regulatory gene, of bacterial operon, 485, 485f Regulatory T lymphocytes (TReg cells), 710, 724 Release factors (RFs, eRFs), in termination of translation, 474 Remicade, 726HP Renaturation (reannealing), of DNA, 400, 401f Repair of DNA, 545, 564–568, 565f base excision repair, 566–567, 566f, 567f double-strand breakage repair, 567–568, 568f mismatch repair, 567

mutant genes involved in, 681 nucleotide excision repair, 565, 566f p53 role in, 675–677, 675f, 676f, 677f XP and, 568–569 Reperfusion damage, 255HP Replica plating, in DNA cloning, 767–768, 768f Replication of DNA in bacterial cells, 549–554, 549f, 550f, 551f, 552f, 553f, 554f, 554fn, 555f DNA polymerase structure and function, 554–558, 555f, 556f, 557f, 558f in eukaryotic cells, 558–564, 559f, 560f, 561t, 562f, 563f histones and, 509–510 semiconservative nature of, 546–549, 546f, 547f, 548f as DNA function, 396 of HIV, 410 telomeres and, 506, 507f, 508 of viruses, 457 Replication foci, 562, 563f Replication forks, 549, 549f in eukaryotes, 561–562, 561t, 562f machinery operating at, 553–554, 554f, 554fn, 555f Replicative senescence, 508 Replisome, 555f, 561, 562f Repressible operon, 485 RER. See Rough endoplasmic reticulum Resensitization, of receptor signaling, 625 Residual body, 305, 305f Resolution of light microscopes, 733–734, 734f magnification and, 733–734, 734f of SEMs, 747 of TEMs, 741 Respiration aerobic, 178–210 anaerobic ATP formation compared with, 188HP photosynthesis compared with, 215, 215f Respiratory tract cilia, 345, 349HP Response elements, 524, 525, 525f, 632, 633f Response to stimuli, of cell, 6 Resting potential, 164–165, 166f measurement of, 164–165, 165f membrane potential compared with, 165 Restriction endonucleases, 764–766, 765f discovery of introns and, 446f Restriction maps, in recombinant DNA technology, 764–766, 765f Retinal, in bacteriorhodopsin and rhodopsin, 160–161, 160f, 622f, 623 Retinitis pigmentosa (RP), 624, 625HP–626HP Retinoblastoma, genes associated with, 673, 673t, 674f Retrieval, of “escaped” endoplasmic reticulum proteins, 298, 298f Retrotransposons, 410 Retroviruses (RNA tumor viruses), 668 discovery of, 694EP–697EP, 694EPf, 694EPt, 695EPf, 696EPf DNA transfer into eukaryotic cells and mammalian embryos via, 775–776 evolutionary origins, 410 oncogenes in, 671 Reuptake of 
neurotransmitter, 170 Reverse genetics, 778 Reverse transcriptase discovery of, 694EP–695EP drug resistance and, 108HP

transposable genetic elements and, 410–411, 410f use in cDNA synthesis, 774, 775f Reversible enzyme inhibitors, 104 RFC, 562, 562f RGD sequence. See Arginine-glycine-aspartic acid sequence R group. See Side chain Rheumatoid arthritis, autoimmunity and, 725HP–726HP RhoA, 599–600 Ribonuclease A, protein folding of, 63–64, 64f Ribonuclease P, 440 Ribonucleic acid. See RNA Ribosomal DNA (rDNA), 403, 435 of human genome, 435 in oocytes, 436, 436f Ribosomal RNA (rRNA), 78–79, 78f, 429, 429f dynein transport of, 338 electrophoretic “fingerprint,” 28EPf in elongation of nascent polypeptide, 472f evolutionary relationships and, 28EP–30EP, 28EPt, 29EPf functions of, 470 in initiation of protein synthesis, 469f methyl groups of, 437–438 in nucleolus, 435–436 processing of, 438–439, 439f in eukaryotes, 434, 453f in ribosomes, 470 sense and antisense, 457, 461 small noncoding, 455–457 structure, 77–79, 429, 429f transport of, 492–493 Ribosomes, 8f–9f, 10–12, 11f, 12f, 279, 279f, 429, 434, 476f E (exit) site of, 470, 471f free compared with membrane-bound, 280, 282 mammalian, 435, 435f in mitochondria, 181f in nucleoli, 440 in protein synthesis, 282–283, 282f, 428f, 468–471, 469f attachment to mRNA, 468, 469f, 470fn, 476f in elongation, 471–474, 472f movement along mRNA, 472–474, 472f movement along mRNA and reading frame, 468 P (peptidyl) site of, 469–470, 469f, 471f relation to ribozymes, 455, 479EP A (aminoacyl) site of, 470, 471f size of, 17, 19f stop codons and, 474–475 structure-function relationships, 470, 471f subunits, 78, 79–80, 79f, 426f, 470, 471f peptidyl transferase site, 478EP–479EP self-assembly, 79–80 of thylakoid membrane, 318f tRNA binding sites, 470, 471f Riboswitches, 487–488 Ribozymes, 78, 95 discovery, 450, 477EP–479EP in elongation of nascent polypeptide, 471, 472f evolution of, 454 hammerhead, 78–79, 78f man-made, 455 as peptidyl transferase, 471, 478EP–479EP in splicing, 451, 452f, 453 Ribulose 1,5-bisphosphate (RuBP) structure as an enzyme substrate, 
98 in carbohydrate synthesis, 227f

in carbon dioxide fixation, 227–228 in photorespiration, 229, 229f, 230f Ribulose bisphosphate carboxylase oxygenase. See Rubisco Ring structures of steroids, 40f of sugars, 43–44 RISC protein complex (RNA-induced silencing complex), 456f, 457, 460 Rituxan, 688, 726HP R-loop formation, research technique, 446, 447f, 448f RNA, 77 as catalysts, 450, 471, 477EP–479EP evolutionary implications, 454 chemical synthesis of, 764 complementarity, 429f complementary base-pairing, 429 DNA-RNA hybrids introns and, 446–448, 447f, 448f during transcription, 430f double-stranded, 455–457, 455f, 456f, 470 editing of, 535 evolution of, 454 folding, 429 functions, 429 man-made, 455 noncoding, 461 polarity, 77f processing control of, 533–535, 534f, 535f quantification of, 771 in ribosomes, 426f 28S, 18S, 5.8S molecules, 434t 5S molecules, 434t synthesis and processing of, 440 16S molecules, 28EP 18S molecules, 28EP structure of, 77f, 78f in studies of molecular evolution, 28EP–29EP, 28EPf synthesis of, 436–440 kinetic analysis of, 438f of precursor, 436–437, 436f, 437f processing precursor, 437–440 snoRNA role in, 438–440, 439f in transposition of genetic elements, 410, 410f type of sugar in, 394fn RNA helicases, 451 RNA interference (RNAi), 272, 278f, 455–457, 455f, 456f, 458HPf, 780, 781f clinical applications of, 458HP–459HP RNA polymerases, 429, 434f, 434t association with DNA template, 429–431, 430f in eukaryotes, 433–434 movement along DNA, 430f operation of, 429–431, 430f in prokaryotes, 431f, 432–433 studies of, 431f transcription factors for, 442fn RNA polymerase II (RNAPII), 434f, 434t, 441, 442f, 443–444, 530 carboxyl-terminal domain, 443, 443f, 453, 453f conformational changes during transcription, 443f coordination of transcription and processing and, 453f paused polymerases, 530 phosphorylation, 443, 443f preinitiation complex, assembly, 443f promoters for, 441–442, 442f in transcription, 441–444 transcription of chromatin, 524-530

RNA polymerase III, 434t 5S rRNA synthesis by, 440 tRNA synthesis by, 440, 440f RNA primers, during replication, 552–553, 553f, 554f “RNA-protein world,” 455 RNA silencing, 455–457 RNA tumor viruses. See Retroviruses “RNA world,” 454 RNPs. See Ribonucleoproteins Rotary molecular motors, 203–204 Rotational (rotary) catalysis, in ATP synthase, 202–204, 203f Rough endoplasmic reticulum (RER), 277f, 279, 279f as biosynthetic entry site, 280, 283, 296f cisternae, 279, 279f of eukaryotic cell, 8f–9f, 11f functions, 280–288 glycosylation, 285–288 lumen of, protein processing in, 283 processing of new proteins, 283 in secretory protein synthesis, 281f smooth endoplasmic reticulum compared with, 279 structure, 279f Rough microsomal fractions, for studying membranous organelles, 275–276, 276f 16S rRNA, 28EP 18S rRNA, 28EP, 435–440 RSV, RNA interference therapy, 458HP–459HP RTKs. See Receptor protein-tyrosine kinases Rubisco, 98, 227 in carbon dioxide fixation, 227–228, 227f, 228f in photorespiration, 229–232, 229f Rubisco assembly protein, 81EP Ryanodine receptors (RyRs), 630, 649–650 S S1 myosin fragment, 360, 361f in actin decoration, 359, 359f operation of, 369, 369f, 371f S4 transmembrane helix (S4 voltage sensor), 154–155, 154f potassium leak channels and, 165 Saccharides, 43–47 Saccharomyces cerevisiae, 18f DNA of, 10 Saltatory conduction, 167, 168f Salt bridge, 35 Salt crystal, in water, 36f Sar1 coat protein, 296–297, 296f, 297f Sarcomas, genes associated with, 673, 673t, 680 Sarcomeres, 366 actinomyosin contractile cycle, 369f bands, zones, and lines, 366, 366f, 367f changes during contraction, 366–368, 368f contractile machinery, 367f in excitation-contraction coupling, 370–371 structure, 366f, 367f, 369f Sarcoplasmic reticulum (SR), 280, 649–650 of muscle fiber, 370–371, 371f Satellite cells, 20HP Satellite DNA, 401, 509 localization, 402–403, 403f Saturated fatty acids, 48, 48f Saturation state of membrane fatty acids health implications, 126 membrane 
fluidity and, 138–139 temperature and, 138–139, 138fn Scaffolding proteins, in MAP kinase pathways, 643–644, 643f

Scanning electron microscopy (SEM), 3, 740–741, 746–748, 747f specimen preparation for, 747 SCF complexes, 578 Schwann cells, 125f, 164f Screening tests for Alzheimer’s disease, 69 for cancer, 72, 73, 670f, 693 in structure-based drug design, 74, 75f Scurvy, 238 SDS. See Sodium dodecyl sulfate SDS-PAGE. See Sodium dodecyl sulfate polyacrylamide gel electrophoresis Secondary active transport, 161, 161f in plants, 161 Secondary culture, 750 Secondary immune response, 706, 710, 710f Secondary oocytes, 608, 611EP Secondary spermatocytes, 608 Secondary structure of proteins, 55–57, 56f, 57f of rRNA, 429, 429f Second messengers, 619, 619f. See also Cyclic adenosine monophosphate of GPCRs, 621–622, 622f, 622t cGMP, 633, 634, 655f, 656 cAMP, 627, 627f calcium, 648–653 DAG, 630 phosphatidylinositol-derived, 311–312, 627–629, 628f, 629f, 645, 645f phospholipase C, 629–630, 629f, 630f, 631t Secretory cells polarity, 280, 281f targeting of proteins and, 300 protein synthesis sites, 280 Secretory granules (vacuoles), 271, 272f, 296f in exocytosis, 303, 303f formation and storage, 300 secretory protein dynamics and, 274f Secretory pathway. 
See Biosynthetic (secretory) pathway Secretory proteins exocytosis, 303, 303f pathway from synthesis to discharge in glandular epithelial cells, 280, 281f in pancreatic acinar cells, 273, 274f signal sequence (signal peptide), 281, 282f sites of synthesis, 273, 274f, 280 study via mutants, 277–278, 277f synthesis on ribosomes, 282–283, 282f role of microsomal membranes, 276 transport, 274f Sections, for bright-field microscopy, 735 Securin, 592, 593f Sedimentation velocity, of nucleic acids, 761–762, 762f Segmental duplication, 407fn Segregation, law of, 388 physical basis, 389 Selectins, 251–252, 252f, 259f antibodies against, 255HP in inflammation, 255HP, 255HPf inhibitors, 255HP Selective precipitation, of proteins, 752–753 Selectivity filters, in ion channels, 153–154, 153f Selenocysteine, 474fn Self antibodies produced against, 706 distinguishing nonself from, 721–722, 721f Self-antigens, 724HP

Self-assembly of macromolecular complexes, 79–80 of proteins, 64, 64f Self-regulation of cell, 6–7, 6f, 7f Self-splicing introns, 450, 451f, 453f, 477EPf Semagacestat, 69HP–70HP Semiconservative replication, 546–549, 546f, 547f, 548f Semidiscontinuous replication, 552–553, 552f, 553f Semipermeable membrane, 149 Senescence, 508, 671 p53 role in, 677–678, 679f, 682f Sense RNA, 457, 461 Sensory perception, GPCRs in, 634–636 Separase, 592 Separation of charge across membranes, 148–149, 158, 164–167, 198 dipoles, 37 in photosystems, 220–221, 223 from translocation of protons, 198 Sequencing DNA, 771–773, 772f evolutionary relationships and, 28EP–30EP, 28EPt, 29EPf, 402 of genomes, 411–417 of human genome, 412 rough draft compared with finished version, 412 of insulin, 55 phylogenetic trees and, 29EP, 29EPf prokaryote diversity and, 15 taxonomic classification and, 28EP–29EP of viral genome, recreating a virus, 25 SER. See Smooth endoplasmic reticulum Serine (Ser, S), 52f, 53 Serotonin, reuptake of, 170 Serum, 72 Seven-transmembrane receptors. See G protein-coupled receptors Sex chromosomes, abnormal number, 609HP SH2 domain. 
See Src-homology 2 domain SH3 domain, 61, 62f, 639f, 643f in protein–protein network, 62–63, 63f, 639 Shadow casting, for TEM, 744–745, 744f, 745f Shaker ion channel, 155f Sheep, cloning of, 513, 513f Shells, of electrons, 33, 33f Shine-Dalgarno sequence, initiation codon and, 468 Shortenings (triglycerides), 49 Short tandem repeats (STRs), 402f Sickle cell anemia iPS cells for, 22HP, 22HPf protein primary structure and, 55, 55f Side chain, of amino acids, 51–54, 51f, 52f, 53f in enzymes, 98–99, 98f, 100f genetic code and, 464, 464f nonpolar, 52f, 53 polar charged, 51–53, 52f, 53f uncharged, 52f, 53 posttranslational modifications of, 54 with unique properties, 52f, 53–54 “Sidedness.” See Asymmetry Sigma (σ) factors, in prokaryotic transcription, 432–433, 432f RNA polymerase II transcription factors and, 442f Signal amplification, by second messengers, 632, 633f Signal hypothesis of protein synthesis, 281–282, 282f Signaling. See Cell signaling Signaling pathway, 619–620, 619f “Signaling” receptors, in endocytic pathway, 312, 313f

Signal peptidase, in protein processing, 283 Signal recognition particle (SRP), in secretory protein synthesis, 282–283, 282f Signal recognition particle (SRP) receptor, 282–283, 282f retrieval signal, 298 Signal sequences “ER export,” 296 internalization signals, on membrane receptors, 320EP KDEL retrieval, 298, 298f for protein retrieval, 298, 298f in protein sorting, 272, 299 in protein synthesis, 281–283, 282f removal from new protein, 282f, 283 in protein targeting, 272, 300, 316 presequence, 316, 317f recognition signals, 272, 298, 316 Signal transduction, 617–618, 617f–618f in apoptosis, 656–658, 657f extrinsic pathway, 658–659, 658f intrinsic pathway, 659–660, 659f, 659fn, 660f basic elements of, 618–621, 618f, 619f, 620f Ca2+ as intracellular messenger, 648 Ca2+-binding proteins, 651–652, 651t, 652f IP3 and voltage-gated calcium ion channels, 649 plant regulation of Ca2+ concentration, 652–653, 652f visualizing cytoplasmic Ca2+ concentration in real time, 649–651, 649f, 650f, 651f in cell survival, 660 convergence, divergence, and cross-talk in, 653–655, 653f, 654f extracellular messengers in, 618–621, 619f survey of, 621 GPCRs in, 617f–618f blood glucose regulation, 631–634, 631f, 632f, 633f, 633t, 634f, 635f disorders associated with, 625HP–626HP, 625HPf, 626HPt second messengers of, 621–622, 622f, 622t, 627–630, 627f, 628f, 629f, 630f, 631t in sensory perception, 634–636 signal transduction by, 622–627, 623f, 625f specificity of responses of, 630–631 human longevity and, 647HP–648HP, 647HPf lipid rafts and, 139–140, 140f across membranes, 122, 122f NO as intercellular messenger, 655–656, 655f protein-tyrosine phosphorylation in, 620–621, 620f, 636 downstream signaling processes activated by, 638–640, 639f end of, 640 in insulin receptor signaling, 644–646, 644f, 645f, 646f, 647HP–648HP, 647HPf phosphotyrosine-dependent protein–protein interactions in, 638, 638f in plants, 648 protein kinase activation in, 638 in Ras-MAP kinase pathway, 
640–644, 641f, 642f, 643f receptor dimerization in, 636–638, 637f Signal transduction pathways, in lymphocyte activation, 723–724 Signal transmission cadherins and, 260 catenins and, 258f focal adhesions and, 248–249 integrins and, 246, 248f, 249

across membranes, and cell-adhesion molecules, 259–260 “outside-in,” 246 Sildenafil, 656 Silencing, of genes knockout mice, 778–780, 779f, 779fn RNA interference, 780, 781f in vitro mutagenesis, 778 Silicon, carbon compared with, 40 Silk, conformation of, 57 Simian sarcoma virus, 680 Simian virus 40 (SV40), cancer caused by, 668 Simple diffusion, 149–156 Simponi, 726HP SINEs (short interspersed elements), 403 Alu repeated sequences, 410 Single-copy (nonrepeated) DNA sequences, 405–406, 405f Single-molecule assays in cytoskeleton studies, 328, 329f, 351f, 362f, 363f in transcription, 431f Single nucleotide polymorphisms (SNPs), 416 disease-linked, 418HP–419HP haplotypes and, 419HP, 419HPf Single-particle reconstruction, 759 Single-particle tracking (SPT) technique, 142–143 Single-stranded DNA-binding (SSB) proteins, 553–554, 554f, 561, 561t, 562f Singlet oxygen (1O*), 217 sis oncogene, 680 Site-directed cross-linking, in membrane protein studies, 137f Site-directed mutagenesis (SDM), 74–75, 135–137, 778 Site-directed spin labeling, 137f Situs inversus, 349HP Sizes of cells, 17, 19f diffusion and, 17 Skeletal muscles chemical neurotransmitters and, 169 mitochondrial abnormalities, 207HP–208HP, 207HPf structure, 364–366, 366f Skeletal muscle cells role of smooth endoplasmic reticulum in, 280 structure, 364–366, 366f Skeletal muscle fibers, fast- compared with slow-twitch, 188HP, 188HPf Skeleton of Golgi complex, 290 of plasma membrane, 137, 143, 143f, 146f, 147 lipid mobility and, 144 spectrin-actin network, 146f, 147 Skin basement membrane, 236f blistering diseases, 250, 257, 356 impermeability, and tight junctions, 262 keratin intermediate filaments in, 354–356, 356f structure, 236, 236f Skin cancer. 
See also Melanoma and DNA repair defects, 568HP–569HP ultraviolet radiation role in, 668 Sliding filament model of muscle contraction, 366–371, 368f Sliding-microtubule theory of flagellar motility, 353, 353f Slow-twitch muscle fibers, 188HP, 188HPf Small, nucleolar ribonucleoproteins, 438–440, 439f Small, nucleolar (sno) RNA, 438–440, 439f Small interfering RNAs (siRNAs), 456f, 457 man-made, 458HP

microRNAs compared with, 460 RNA interference using, 278, 780, 781f use in medicine, 458HP–459HP use in mammals, 457 Small noncoding RNAs, 455–461 Small nuclear ribonucleoproteins (snRNPs), 450–451 operation of, 452f structure, 453f types of, 451, 452f, 453, 453f Small nuclear RNAs (snRNAs), 450–451, 453, 453f operation of, 452f types of, 452f, 453f Smell, GPCR role in, 634–635 Smoking, cancer caused by, 668 Smooth endoplasmic reticulum (SER), 277f, 279, 280, 280f of eukaryotic cell, 8f–9f, 11f of muscle cells. See Sarcoplasmic reticulum rough endoplasmic reticulum compared with, 279 Smooth muscle, and gap junctions, 263–264 SNARE proteins, 301f, 302, 303f snoRNA. See Small, nucleolar RNA SNPs. See Single nucleotide polymorphisms snRNAs. See Small nuclear RNAs Soap, 47–48, 48f Sodium dodecyl sulfate (SDS), solubilization of membrane proteins with, 133 Sodium dodecyl sulfate polyacrylamide gel electrophoresis. See SDS-PAGE Sodium equilibrium potential (ENa), 165 Sodium ion channels, voltage-gated in acetylcholine receptor, 173EP–174EP, 174EPf in action potential, 166, 166f in synaptic transmission, 169, 169f Sodium ions, intestinal cotransport with glucose, 160–161, 161f Sodium-potassium pump (Na+/K+-ATPase), 157–159, 158f, 161, 161f, 165 concentration gradient and, 199, 200f Soft tissue cancers genes associated with, 673, 673t incidence and mortality of, 665f Solid tumors development of, 669–670, 670f inhibition of cancer-promoting proteins in, 691 Solutes, 38 diffusion across membranes, tight junctions and, 262 movement across membranes, 121–122, 122f, 147–161 diffusion, 149–156 energetics, 148–149 osmosis and, 149–150, 150f polarity, 149 Somatic cell nuclear transfer (SCNT), 21HP–22HP Somatic hypermutation, of Igs, 715 Somatic mutation, 626HP Somavert, 75 Sorting signals, in protein transport, 272, 299, 300 Sos, Ras interaction with, 642 Southern blot, 763, 763f Specialized cells, 15–17, 15f, 16f Specific activity, of purified proteins, 752 Specificity, of 
proteins, examples of, 50, 95–96, 710 Specimen preparation for microscopy for bright-field microscopes, 735 for SEM, 747 for TEM, 742–743, 743f cryofixation and frozen specimens, 743–744 freeze-fracture replication and freeze etching, 745–746, 745f, 746f

negative staining, 744, 744f shadow casting, 744–745, 744f, 745f Speckles, fluorescence speckle microscopy, 327, 344f Spectrin, of erythrocyte membrane, 146f, 147 Spectrin-actin network, of plasma membrane skeleton, 146f, 147 Spectrophotometry, 757–758, 758f, 758fn Spectroscopy, electron paramagnetic resonance, use in membrane protein studies, 136–137, 137f “Speech gene,” 414 Sperm (spermatozoa) flagella, 13f beat pattern, 350f sliding of microtubules, 353f formation of, 603, 603f mitochondria, 179–180, 179f plasma membrane domains, 144, 145f polyspermy, 388 Spermatids, 603, 603f Spermatocytes primary, 603, 603f secondary, 608 Spermatogonia, 603, 603f S phase, 573, 573f Sphingolipids, in membranes, 126–127, 126f Sphingomyelin (SM), 126, 126f, 127t, 129f Sphingosine, 126–127, 126f Spinal cord injuries, ES cells for, 21HP Spindle assembly checkpoint (SAC), 596–597, 596f Spindle pole cytokinesis and, 599–601, 600f formation without centrosomes, 588, 588f Spleen, 700f Spliceosomal helicases, 451 Spliceosomes, 450–451, 450fn, 453 assembly, 452f coordination of transcription and processing and, 453f operation of, 452f self-splicing introns compared with, 453f Splice sites, in pre-mRNA, 450, 450f, 452f Splicing, 449–454 alternative, 454, 534–535, 535f differences between organisms and, 412–413 coordination with transcription and polyadenylation, 453f evolutionary implications, 454 intermediates formed, 452f, 453, 453f mechanism, 450–454, 452f molecules required, 450–451 precision, 449 self-splicing of introns, 450, 451f, 453f, 477EPf splice sites, 450, 450f, 452f Splicing signals, 450, 450f, 450fn, 534–535 Split genes, 444–448 evolutionary implications, 454 introns and exons, 445–446 Spongiform encephalopathy, 66HP Spontaneous mutation rate, 557–558 Spontaneous processes, 88–90 Spores, formation of, 603, 603f Sporic meiosis, 603, 603f Sporogenesis, 603, 603f Sporophyte, 603, 603f SPT technique. 
See Single-particle tracking technique Src-homology 2 (SH2) domain, 638–640, 638f, 639f SRC oncogene, 680 discovery of, 695EP–696EP, 696EPf Src proteins, as peripheral membrane proteins, 138 Src protein kinase, integrins and, 248f, 249

SRP. See Signal recognition particle SSB proteins. See Single-stranded DNA-binding proteins Stacked bases, of DNA, 396 Staining for electron microscopy, 743–744, 744f with light microscopes, 735, 735f Standard conditions in thermodynamics, 91, 91fn, 189 Standard free energy change (ΔG°′), 91–92, 91fn, 91t in redox reactions, 190 in TCA cycle, 185f Standard redox potentials (E0), 189–190, 189f, 190fn, 190t Starch, 45–47, 46f, 228 Stargardt’s macular dystrophy, ES cells for, 21HP Starvation, autophagy during, 305 Statin cholesterol-lowering drugs, 314 STAT transcription factors RTK interaction with, 639f, 640 in signal transduction pathways of lymphocyte activation, 724 Steady-state metabolism, 93–94 equilibrium compared with, 93–94, 94f “Stealth” liposomes, 128, 129f Stearic acid, 48, 48f melting point of, 139t Stelara, 726HP Stem cells, 20HP adult, 20HP, 20HPf cancer, 692 embryonic, 21HP–22HP, 21HPf, 779 hematopoietic, 20HP as autoimmune disease therapy, 726HP differentiation of, 703–704, 703f telomerase in, 508 immunologic rejection and, 21HP induced pluripotent, 22HP–23HP, 22HPf, 519 transcription factors as determinants of, 519 “transdifferentiation,” 21HP, 23HP in tumor development, 669, 670f Stereocilia, 364, 365f, 373 Stereoisomerism, 44, 44f, 45f of amino acids, 50, 51f of glyceraldehyde, 44, 44f Steroids, 49 in cell signaling, 621 structure of, 49f Steroid hormone receptors, in cell signaling, 621 Steroid rings, 40f Stochastic optical reconstruction microscopy (STORM), 740, 740f Stomach acid secretion, by H+/K+-ATPase, 159, 159f Stomach cancer. See Gastric cancer Stomata of leaves, 213f, 652–653 in C4 plants, 232 in CAM plants, 232 Store-operated calcium entry (SOCE), 650–651, 651f STORM. 
See Stochastic optical reconstruction microscopy Strands, DNA, separation of, 399–400, 549–550, 550f Streptomycin, 106HP Stroke, integrin-targeting drugs for, 246, 247f Stroma, of chloroplast, 214 Stroma targeting domain, in chloroplast protein uptake, 318, 318f Stroma thylakoids (stroma lamellae), 214f Strong acids and bases, 39, 39t

Structural genes of operons, 484–485, 485f Structural polysaccharides, 46f, 47, 47f Structural variants, chromosomal, 416–417, 416f Structure-based drug design, 75–76, 75f Structure determination, of proteins and multisubunit complexes, 758–760, 758f, 759f, 760f Subcellular fractionation, 275–276, 276f, 752 Substrate-level phosphorylation, 111f, 112f, 113 oxidative phosphorylation compared with, 189 Substrates of enzymes, 95–96 complex with enzyme, 86f, 97–98, 97f enzyme impact on, 98f, 99–102 in reaction rate, 103f Substrate reduction therapy, for lysosomal storage disorders, 307HP Substratum cell adhesion to, 235f, 236f, 242f, 243, 247–250, 249f focal adhesions and, 248f in cell locomotion, 374, 376–379, 376f, 379f Subunits, of proteins, 60 Succinate dehydrogenase, 194f structure and function, 196 Sucrose, 45 synthesis in plants, 227f, 228, 228f D-Sugar, 44, 44f L-Sugar, 44, 44f Sugars, 43–47. See also Carbohydrates disaccharides, 45, 45f linking together, 44–45, 45f, 46f monosaccharides, 43 in nucleosides, 394fn in nucleotides, 77f, 393f, 394, 394fn oligosaccharides, 45 as precursors, 42 in RNA compared with DNA, 394fn structure of, 43–44, 43f water interaction with, 38 Suicide of cells. See Apoptosis Sulfa drugs, 106HP Sulfhydryl group in cysteine, 53 mercaptoethanol and, 63 Sulfide ions, in iron-sulfur centers, 192, 192f Sulfur bacteria, photosynthesis in, 212f, 214 Sunlight damage from, 35HP as energy source, 5, 217 Supercoiled DNA, 397–398, 430f, 550 Superfamilies of proteins, 76–77, 407–408 Superoxide dismutase (SOD), 35HP Superoxide radical (O2•−), 35HP Super-resolution fluorescence microscopy, 740, 740f Surface area of cell, 17 Survival, cell, signaling role in, 660 SV40. 
See Simian virus 40 Symport, 161 Synapses, 168–171, 168f dysfunction, nervous system diseases and, 171 effects of drugs, 170 Synapsis, of homologous chromosomes, 604–605, 605f Synaptic cleft, 168–169, 168f, 169f Synaptic plasticity, 170–171 Synaptic transmission, 168–171 sequence of events, 169–170, 169f Synaptic vesicles, 168f, 169 in neurotransmission, 169–170, 169f SNARE docking proteins and, 302 Synaptonemal complex (SC), 605, 606f Synaptotagmin, 59f Synonymous nucleotide changes, 464, 464f

Syntelic attachments, of microtubules, 589f, 596–597, 596f Synthetic biology, 17–19, 19f Systemic lupus erythematosus (SLE) autoimmunity and, 725HP–726HP, 725HPf Sm proteins and, 451 Systems closed compared with open, 94 energy changes, 87–88, 88f entropy and, 89f internal energy, 87–88 T Tails, of histone proteins, 495-496, 499-501 Talin, binding to integrin, 246, 246f, 248f Tamoxifen, 669 Tandem DNA repeats, 401–402 mechanism of, 407, 407f use in identifying persons, 402f Taq polymerase, 769–771, 770f Targeted therapies, for cancer, 687–693 Targeting of proteins, 272, 299–300, 299f signals for, 272, 300, 316 of vesicles, 300–303, 301f specificity of, 300, 302 Taste, GPCR role in, 635–636 TATA-binding protein (TBP), 442, 442f, 442fn TATA box, 442, 442f, 443f in transcription regulation, 522, 523f tau, in AD, 70HP Tau microtubule-associated protein, 331 Taxol, effects on microtubules, 341 Taxonomic classification evolutionary relationship of, 27EP–28EP nucleotide sequence and, 28EPt of prokaryotes, 14 sequencing approach, 28EP–29EP Tay-Sachs disease, 306HP–307HP, 306HPt T bacteriophages, structure, 24 TBP. See TATA-binding protein TBP-associated factors (TAFs), 442f TCA cycle. 
See Tricarboxylic acid cycle T cells activation and mechanism of action of, 707–710, 707f, 708f, 709f, 709fn APC activation of, 722–723, 722f APC interactions with, 717, 718f, 720–721 in autoimmune disease, 724HP, 726HP B cell activation by, 722f, 723 cancer immunotherapy using, 689 cytotoxic, 709 APC interactions with, 718f, 720 development of, 721 division of, 699f DNA rearrangements producing genes encoding antigen receptors of, 713–716, 714f, 715f helper, 709–710, 709f, 709fn APC activation of, 722–723, 722f APC interactions with, 718f, 720–721 B cell activation by, 722f, 723 development of, 721 in immune response, 701f, 703, 703f memory, 708 MHC interaction with, 727EP–728EP, 727EPt, 728EPt regulatory, 710 selection of, 721, 721f T-cell receptor (TCR), 707–708, 716, 716f DNA rearrangements producing genes encoding, 713–716, 714f, 715f

MHC interactions with, 728EP–730EP TCR. See T-cell receptor T-DNA transformation, 777, 777f Telomerase, 506, 507f in cancer development, 666, 669–670, 682f Telomeres, 505–509, 506f, 507f, 508f Telophase of meiosis, 607f, 608 of mitosis, 597, 597f Temperature absolute temperature, 89, 91 enzyme–catalyzed reaction and, 103–104, 104f fatty acid saturation and, 138–139, 138fn membrane fluidity and, 138–139, 138f transition temperature, of lipid bilayer, 138, 138f Temperature-sensitive mutants, 275f in protein transport studies, 275, 275f in DNA replication studies, 549 in cell cycle studies, 578 Tendons, collagens in, 240 Teprotide, 104 Teratoma from embryonic stem cells, 21HP from iPS cells, 22HP–23HP Terminal knob of axon, 164, 164f, 168f, 169 in synaptic transmission, 169f, 170 Terminal meiosis, 603, 603f Termination, in protein synthesis, 474 Termination codons, 463, 474fn mutations and, 474–475 premature, 474–475 Tertiary structure of proteins, 57–58, 57f of myoglobin, 58–59, 58f primary structure and, 74, 74f self-assembly of, 64, 64f of PrPC protein, 66HP, 67HPf of rRNA, 429, 429f Testosterone, 49, 49f cancer and, 669 “Test-tube evolution,” 455 Tetanus, vaccination against, 706–707 Tethering proteins, for joining vesicles and targets, 301–302, 301f Tetracyclines, 106HP, 107HPt Tetrads. See Bivalents Tetranucleotide theory, of DNA, 394, 420EP Tetroses, 43 TFIID transcription factor, 442–443, 442f, 442fn, 526 TFIIH transcription factor in DNA repair, 565, 566f enzymatic activities, 442–443 TGN. See Trans-Golgi network TH cells. See Helper T cells Thermodynamically favorable events, 88 Thermodynamically unfavorable processes, 90 Thermodynamics of chemical reactions, 90–91 kinetics compared with, 95, 192–193 laws of, 87–89 of metabolic reactions, 91–92 Thermophiles, 14 Thick filaments, of muscle fibers, 366, 366f structure of, 368 Thin filaments. 
See also Actin filaments in excitation-contraction coupling, 370–371, 371f of muscle fibers, 366, 366f myosin II as motor, 369

sliding during contraction, 368, 368f, 369f structure, 368, 368f Thioredoxin, in chloroplast enzyme control, 228–229, 229f 3d generation sequencing, 773 Three-dimensional culture, 751, 751f Threonine (Thr, T), 52f, 53 D-Threose, 44f L-Threose, 44f Threshold of action potential, 166, 166f Thrombin, 47 Thrombus. See Blood clotting Thylakoids, 214 Thylakoid membranes, 214, 214f cyanobacteria and, 14f enzyme localization, 122f light-dependent reactions in, 224f Thylakoid transfer domain, in chloroplast protein uptake, 318, 318f Thymine (T), 78, 393f, 394 base pairing, 395f–396f DNA repair and, 567 structure, 78f Thymosins, and actin polymerization, 372 Thymus, 700f, 703, 703f T-cell development in, 721, 721f Thymus-independent antigens, 705 Thyroid adenomas, 626HP Thyroiditis, autoimmunity and, 725HP Tic complexes, in chloroplast protein uptake, 318 Tight junctions (zonulae occludens), 258f, 261–262, 261f molecular composition, 262f TIM complexes, 316 Ti plasmid, 777, 777f TIRF microscopy, 329 Tissues, 235-236 formation role of cadherins, 254 role of cell-cell recognition, 250–251, 251f organization, 236f Tissue culture. See Cell cultures Titin, 51, 368, 369f Tobacco mosaic virus (TMV), 23–24, 23f self-assembly, 79–80 Toc complexes, in chloroplast protein uptake, 318 Tolerance, immunologic, 706, 724HP Toll-like receptors (TLRs), 700–702, 702f discovery of, 700–701, 701f drugs targeting, 702 TOM complexes, 316 Tomograms, 744 Tonoplast of plant vacuole, 308 Topoisomerases, 398, 399f, 430f, 550, 583 Torpedo marmorata, 171EP–172EP, 172EPf Tovaxin, 726HP Toxoids, 706 TP53 gene. 
See also p53 in cancer development, 682 in colon cancer, 683–684, 683f, 684f TP53 gene mutations, 675–678, 675f, 676f, 677f, 679f Traction forces in cell locomotion, 378, 379f from focal adhesions, 248–249, 248f Trans-autophosphorylation by RTKs, 637f, 638 Transcription, 428, 428f, 430f, 476f chromatin and, 513 coordination with pre-mRNA processing, 453f in eukaryotes, 433–434 initiation, 441–442, 442f, 443f

glucocorticoid receptor in, 524–525, 525f ncRNA and, 461 overview of, 429–434 posttranscriptional modifications, in tRNA bases, 465 in prokaryotes, 432–433 elongation complex, 432 eukaryotes compared with, 433–434 initiation, 432f, 433, 433f termination, 433 regulation by microRNAs, 460 repression of, posttranscriptional silencing, 456 RNA chain elongation, 430f RNA polymerase in, 430f Transcriptional activators, 518 enhancers, promoters, and coactivators, 525–530, 526f Transcriptional control, of gene expression, 514–533, 514f, 516f, 517f DNA sites in, 522–525, 522f, 523f transcription factors for, 517–522, 518f, 522f Transcriptional elongation complex, 432 Transcriptional repression, 530–533, 530f DNA methylation, 530f, 531–532, 531f genomic imprinting, 531f, 532 long noncoding RNAs, 530f, 532–533, 533f model for, 530f Transcription bubble, 430f Transcription-coupled pathway, of DNA repair, 565 Transcription factors, 517 activation of, 527–528, 528f for binding RNA polymerase to promoter, 430 binding sites of, 518, 518f coactivators basal transcription machinery and, 526 chromatin structure and, 526–530, 527f, 528f, 529f deletion mapping, 523 DNA footprinting, 523 ES cells and, 519 in eukaryotes, 434 evolution of humans and, 414 in gene expression, 517–519, 518f genome-wide location analysis, 523–524, 524f Huntington’s disease and, 405HP iPS cells and, 519 motifs, 519–522 HLH, 521, 521f leucine zipper, 522 zinc-finger, 520–521, 520f oncogenes encoding, 680 phenotype and, 518–519, 519f for RNA polymerases, 442fn RNA polymerase II, 441–442, 442f RTK interaction with, 639f, 640 structure of, 518–522, 518f, 520f TATA-binding protein, 442fn types of, 517–518 Transcription profiling. 
See gene-expression analysis Transcription unit, 434 nonribosomal, 448f Transcriptome, 461 Transdifferentiation of stem cells, 21HP, 23HP Trans double bonds, 48 Transduction, of energy, 87–88, 87f, 122 Transfection, 776 Transfer RNA (tRNA), 429, 464–468 attaching to amino acids, 466–468 linkage site, 465f attaching to ribosomes, 470, 471f complementarity, 465–466, 465f

deacylated, 473–474, 473f in elongation of nascent polypeptide, 470–474, 472f folding, 465 in initiation of protein synthesis, 468–469, 469f interaction with mRNAs, 466, 467f invariant bases, 465, 465f modified bases, 465–466, 465f structure of, 465–468, 465f, 466f synthesis of, 440, 440f Transformation of bacteria, 421EP–422EP, 421EPf Transforming growth factor-β2 (TGF-β2), quaternary structure of, 60, 61f Transgenes, 455f, 776 Transgenic animals, 776–778, 777f Transgenic plants, 776–778, 777f Trans-Golgi network (TGN), 290, 291f, 296f, 299f, 300f endocytic pathway and, 313f protein sorting at, 299–300 Transition state, in enzymatic reactions, 96–97, 96f Transition temperature, of lipid bilayer, 138, 138f Transit peptide, in chloroplast protein uptake, 318 Translation, 428f, 429, 464–465, 468–477, 476f control of, 536–540, 536f microRNA in, 539–540, 540f mRNA stability and, 538–539, 539f eukaryotic compared with prokaryotic, 476, 476f inhibition of by microRNAs, 460 by RNA interference, 456f initiation of, 468–471, 536–537, 537f in eukaryotes, 469–470, 469f in prokaryotes, 468–469, 469f mRNA localization, 537–538, 538f in rough ER, 273, 274f, 280–284 termination of, 474 Translesion synthesis (TLS), 569 Translocations, human disorders and chromosomal, 504HP–505HP, 505HPf, 690 of nascent polypeptides, during elongation, 472–473, 472f, 473f Translocation complexes in chloroplast protein uptake, 318, 318f in mitochondrial protein uptake, 316, 317f Translocons of chloroplast membranes, 318, 318f of rough endoplasmic reticulum, 282–283, 282f orientation of membrane proteins and, 284, 284f structure, 283 Transmembrane domains, of integral membrane proteins, 134–135, 134f Transmembrane proteins integration into membranes, 283–284, 284f multispanning, 284 in nucleus, 489, 489f orientation in membranes, 284, 284f, 285f Transmission electron microscopy (TEM), 3, 740–742, 741f, 742f specimen preparation for, 742–743, 743f cryofixation and frozen specimens, 743–744 
freeze-fracture replication and freeze etching, 745–746, 745f, 746f negative staining, 744, 744f shadow casting, 744–745, 744f, 745f Transpeptidases, 106HP Transplants in cell replacement therapy, 20HP-23HP hematopoietic stem cells, as autoimmune disease therapy, 726HP

immunologic rejection of, MHC role in, 716 of nuclei into other cells, 21, 513 Transport. See also Active transport anterograde, 333 axonal, 332–334, 333f role of microtubules, 333, 333f along biosynthetic/secretory pathway, 272f, 296f motor proteins and, 337f cytoskeleton and, 325f, 326, 326f along endocytic pathway, 272f intraflagellar, 350, 350f across membranes, 121–122, 122f, 147–161 microfilament-related, 356, 358 along microtubules, 326f direction of transport, 333, 337f kinesins in, 334–337, 335f, 336f role of motor proteins, 334–339 of organelles dynein-mediated, 337–339, 337f kinesin-mediated, 336–339, 336f, 337f along microtubules, 326f, 334f myosin V and, 363f passive, 148f retrograde, 333, 334f secondary active, 161, 161f in plants, 161 Transport carriers, between endoplasmic reticulum and Golgi complex, 293 VTCs, 289, 290f Transport vesicles, 271, 272f, 289 in axonal transport, 333f docking at target, 301f, 302 formation, 295–296, 296f fusion with target, 272f, 277f, 296f, 302, 303f specificity of, 300, 302 in Golgi complex, 293, 294f, 295 direction of movement, 294, 294f protein coats, 295, 297 movement toward target, 300 role of microtubules, 333 secretory protein dynamics and, 274f targeting, 300–303 tethering to target, 301–302, 301f transmembrane cargo receptors, 297f Transposable genetic elements, 409–411, 409f in evolution, 411 disease and, 410 insertion into gene, 410 meaningful compared with “junk,” 461 Transposase, 409, 409f Transposition of genetic elements, 409–411, 409f eukaryotic mechanisms, 410, 410f Transposons, 409–410, 409f structure, 409, 410f suppression by RNA interference, 457 Transverse (T) tubules, of muscle fiber, 370, 371f “Treadmilling” of actin filaments, 359f, 360 TReg cells. 
See Regulatory T lymphocytes Triacylglycerol, 47, 48f TRiC, 65 Tricarboxylic acid (TCA) cycle, 109f, 110, 185–186, 185f, 186f catabolic pathways and, 186f electron transfer in, 185f, 190 net equation, 186 pyruvate dehydrogenase and, 61 substrates in, 185f redox potentials, 190 Trilaminar nature, of plasma membrane, 120–121, 121f

Trinucleotide repeats, 404HP–405HP, 404HPf Trioses, 43 Triple bonds, 34 Triplets (trinucleotides) of genetic code (codons), 404HP, 462–464 variability in, 464, 466 wobble hypothesis, 466, 467f Triskelions, of clathrin, 308–309, 310f Trisomies, 608HP–609HP Trisomy 21, 609HP, 609HPf Tristearate, 48, 48f Triton X-100, solubilization of membrane proteins with, 133 tRNA. See Transfer RNA Tropomyosin, 368, 368f in muscle contraction, 370–371, 371f Troponin, 59f, 368, 368f in muscle contraction, 370–371, 371f trp operon, 485, 486f Tryptophan (Trp, W), 52f, 53 bacterial culture in, 484 Tuberculosis (TB), 106 Tubulins fluorescent labeling, in cytoskeleton studies, 327, 327f γ-tubulin, 340f, 341 γ-TuRCs, 340f, 341 in microtubule assembly, 342–343, 343f in microtubule nucleation, 340–341, 340f, 342f Tubulin-GDP dimers, in microtubule dynamics, 342–344, 343f Tubulin-GTP dimers, in microtubule dynamics, 342–344, 343f αβ Tubulin heterodimers, 330–331, 331f, 340f, 341 assembly and disassembly, 342–343 Tumors. 
See also Cancer benign, 670 development of, 669–671, 670f oncogene role in, 671–672, 671f, 672f, 679–681, 679f, 682f tumor-suppressor gene role in, 671–679, 671f, 672f, 673t, 674f, 675f, 676f, 677f, 679f gene-expression analysis of, 685–687, 686f, 687f growth of, 665–667, 666f, 667f invasion of normal tissue by, 664, 665f secondary, 256HP, 256HPf solid, 669–670, 670f inhibition of cancer-promoting proteins in, 691 stem cells of, 692 therapy for, 687–688 chemotherapy, 665, 677–687 early detection, 693 gene-expression analysis guiding, 687, 687f immunotherapy, 688–689 inhibition of angiogenesis, 692–693, 693f inhibition of cancer-promoting proteins, 689–692, 690t, 691f preventive measures, 668 radiation, 665, 677, 687 targeted, 687–688 variety of mutations in, 683–684, 683f, 684f Tumorigenesis, 669 Tumor necrosis factors (TNFs), 708, 709t in apoptosis, 658, 658f in autoimmunity, 726HP in cell survival, 660 Tumor-suppressor genes, 671–672, 671f, 672f functions of, 672–679, 673t, 674f APC gene role in FAP, 678 BRCA1/BRCA2 role in breast cancer, 678, 679f PTEN, 678

RB role in cell cycle regulation, 674, 675f TP53 role in cell arrest and apoptosis, 675–677, 675f, 676f, 677f, 679f TP53 role in senescence, 677–678, 679f Tumor viruses, 26 discovery of, 694EP–697EP, 694EPf, 694EPt, 695EPf, 696EPf DNA, 668, 674 oncogenes in, 671 RNA, 671, 694EP–697EP, 694EPf, 694EPt, 695EPf, 696EPf Tunneling nanotubes, 264, 264f Turgor pressure, 150–151, 150f role of vacuoles, 308 Turnover number (catalytic constant), 95t, 103 2d generation sequencing, 773 2R hypothesis of evolution, 406–407 Two-dimensional culture, 751, 751f Type 1 diabetes (T1D) autoimmunity and, 725HP cell replacement therapy for, 20HP Type 2 diabetes (T2D), insulin receptor signaling and, 646 Tyrosine (Tyr, Y ), 52f, 53 Tyrosine kinases. See Protein-tyrosine kinases Tyrosine kinase inhibitors in cancer therapy, 75–76, 75f, 690-691 screening tests for, 75–76, 75f Tyrosine phosphorylation. See Protein-tyrosine phosphorylation Tysabri, 726HP U Ubiquinone (UQ, coenzyme Q), 191–193, 191f operation of, 194f Ubiquitin endocytosis and, 312 protein degradation and, 542, 592-593 Ubiquitin ligases, 542, 578, 592-593 Ubisemiquinone, 191f, 192 Ultracentrifugation. See also Centrifugation of nucleic acids, 760–762, 762f equilibrium centrifugation, 762, 762f velocity sedimentation, 761–762, 762f Ultraviolet (UV) radiation as carcinogen, 668 DNA damage from, 568HP–569HP XP and, 568–569 Uncoupling of oxidation and phosphorylation, 198–199 Uncoupling proteins (UCPs), 199 Underwound (negatively supercoiled) DNA, 398 in transcription, 430f Undifferentiated cells ES cells, 21HP iPS cells, 22HP–23HP Unfolded protein response (UPR), 288, 289f cell death and, 288 Unsaturated fatty acids, 48–49, 48f UPR. See Unfolded protein response Upstream compared with downstream DNA, 430f, 433, 433f Uracil, 78 DNA repair and, 567 structure, 78f Uridine, conversion to pseudouridine, 439, 439f Usher 1B syndrome, 364 UV radiation. See Ultraviolet radiation V Vaccination, 706–707 Vaccines, for Alzheimer’s disease, 69HP

for cancer, 689 Vacuolar proteins, synthesis in rough ER, 282–283 Vacuoles, 272f of plant cells, 8f–9f, 307–308, 307f Valine (Val, V), 52f, 53 Vancomycin, 106HP van der Waals forces, 37, 38f in myoglobin, 59, 59f Vectibix, 688 Vector DNA, 766, 774 Vegetable fats, 48–49 VEGF, cancer therapy targeting, 693 Velocity sedimentation, of nucleic acids, 761–762, 762f Vertebrates, evolution from invertebrates, 406 Very-long-chain fatty acids (VLCFAs), and peroxisomal disorders, 209HP Vesicles, 271fn of animal cell, 8f–9f budding (formation), 272f, 277f, 285, 285f, 296f membrane bending, 297 membrane phospholipids, 285 protein coats, 276, 277f, 295, 297 study via mutants, 277–278, 277f docking at target, 301f, 302 dynamic studies, 320EP–321EP, 321EPf fusion with target, 272f, 277f, 302, 303f specificity of, 300, 302 membranous, 275, 276f, 300 molecular motors of, 363–364, 364f movement toward target, 300 of plant cell, 8f–9f secretory, 272f study via cell fractionation, 275–276, 276f synaptic, 169–170, 169f targeting, 300–303 tethering to target, 301–302, 301f Vesicular stomatitis virus gene (VSVG) protein in Golgi transport studies, 294, 294f in protein transport studies, 274–275, 275f Vesicular transport, 271, 271fn, 272f, 295–303, 296f back to endoplasmic reticulum, 294f, 295, 296f, 298 direction of movement, 337f docking at target, 301f, 302 dynamic studies, 320EP–321EP, 321EPf dyneins and, 337f, 338–339 from endoplasmic reticulum to Golgi complex, 272f, 289, 290f, 296–298, 296f, 298f through Golgi complex, 272f within Golgi complex, 293–294, 294f direction of movement, 294, 294f vesicular transport model, 294f kinesins and, 336–339, 337f, 364f along microfilaments, 363–364, 364f along microtubules, 333f, 336–337, 337f, 364f molecular motors for, 363–364, 364f movement toward target, 300 myosins and, 363–364, 363f, 364f protein sorting and targeting, 299–300, 299f retrograde, 294f, 295, 296f, 298 tethering to target, 301–302, 301f traffic 
patterns, 272 from trans Golgi network to target sites, 272f, 295, 299–302, 299f vesicle docking and fusion, 302, 303f Vesicular transport model, 293–294, 294f Vesicular-tubular carriers (VTCs; membranous carriers), 289, 290f, 296f dynein-mediated transport, 337f

V genes of Igs, 713–716, 714f, 715f Video microscopy, 738–739 Villin, 373 Viral infections, 24–26, 25f in tumorigenesis, 668 lytic, 25 provirus, 25–26 RNA interference therapy, 458HP–459HP types of, 25–26 Virion, 24, 26 Viruses, 23–26. See also Bacteriophages basic properties, 23 beneficial uses, 26 in cancer development, 694EP–697EP, 694EPf, 694EPt, 695EPf, 696EPf capsids of, 23–24, 23f, 24f diversity of, 24, 24f DNA transfer into eukaryotic cells and mammalian embryos via, 775–776 genetic material of, 24, 24f genome complexity, 400, 401f host range of, 24–25 hosts of, 24 infection by, 25f replication, blocking by RNA interference, 457 structure of, 23, 23f, 24f tumors and, 26, 668 use for gene therapy, 26, 163HP Viscosity, fluidity compared with, 138fn Visibility, with light microscopes, 734–735, 735f Vision, GPCR role in, 634 Vitamins, 42 antioxidant, 35HP V(D)J recombinase, 714–715, 714f Volts, conversion to calories, 219f Voltage charge and potential compared with, 165, 165f across membranes, 164–165, 198 in action potential, 166, 166f measurement of, 152, 164–165, 165f proton electrochemical gradient and, 198 Voltage-gated ion channels, 152. See also specific ion channels in action potential, 166, 166f opening and closing of, 153–156, 154f, 156f in synaptic transmission, 169, 169f Voltage-sensing domain (voltage sensor), of potassium ion channel, 154, 154f, 155f V-type ion pumps, 159–160

W WASP. See Wiskott-Aldrich syndrome protein Water amino acid interaction with, 38 biological properties, 38 diffusion across membranes, tight junctions and, 262 diffusion through membranes, 149–151, 150f via aquaporins, 151, 151f dissociation, 40 as free radical, 35HP hydrogen bonds in, 37–38, 38f ice-water transformation thermodynamics, 90, 90f, 90t ions in, 36, 36f ion–product constant of, 40 life-supporting properties, 37–38 lipids and, 47 membrane role of, 38 nonpolar molecules in, 36f from oxygen reduction, 197f in mitochondria, 196 in photosynthesis electron flow, 220f, 221–222 as electron source, 212, 220, 224f, 225 oxidation of, 220f for oxygen formation, 215 photolysis, 221 polarity, 34 in protein structure, 38, 39f salt crystal in, 36f splitting of, 221–222 structure of, 37fn sugar interaction with, 38 thermal properties of, 38 Watson-Crick base pairs, 394–396, 395f–396f Watson-Crick model of DNA, 386f, 394–396, 395f–396f replication model, 396–397, 546, 546f Waveforms (beating patterns), of flagella, 345, 346f Weak acids and bases, 39, 39t Weak attractive forces, 37, 38f Weak ionic bonds, 36 Weed killers, and electron transport, 225 Western blot, 757, 782–783 Whole-genome duplication, 406–407, 407fn Whole mounts, for bright-field microscopy, 735 Wild type, 390 Wiskott-Aldrich syndrome protein (WASP), in cell locomotion, 377, 377f

Wobble hypothesis, 466, 467f X Xalkori, 691 X chromosome inactivation, 498f, 499 heterochromatin formation during, 500–501, 500f, 501f Xeroderma pigmentosum (XP), 568–569, 568HP X-ray crystallography, 57–58, 57f, 58f, 758–760, 758f, 759f, 760f of membrane proteins, 133–134, 133f, 623 X-ray diffraction, 758–760, 758f, 759f, 760f Y Y2H assay. See Yeast two-hybrid assay Y chromosome, maleness and, 609HP Yeast, DNA of, 10 Yeast artificial chromosome (YAC), 774 Yeast cells fermentation in, 114, 114f use in research, 277–278 Yeast two-hybrid (Y2H) assay, 755, 755f of new proteins, 63 for protein–protein interactions, 62 network of, 62–63, 63f uncertainties of, 62 Yervoy, 688 Yolk proteins, uptake by oocytes, 319EP Z Zelboraf, 691, 691f Zellweger syndrome (ZS), 208HP Zevalin, 688 Zinc-finger motif, 520–521, 520f Z line of sarcomere, 366, 366f Zonulae adherens, 257, 258f. See also Adherens junctions Z scheme of electron flow, 219, 219f Zygotene, 604, 604f, 605f Zygotes with abnormal chromosome number, 608HP chromosome number, 388f Zygotic meiosis, 603, 603f


Topics of Human Interest NOTE: An f after a page denotes a figure; t denotes a table; fn denotes a footnote; HP denotes a Human Perspective box; EP denotes an Experimental Pathway box. Acquired immune deficiency syndrome. See AIDS Acute lymphoblastic leukemia (ALL), Chapter 16, 685–686 Acute myeloid leukemia (AML), Chapter 16, 685–686 Adaptive (acquired) immune response, Chapter 17, 703–724 Adenoviruses, Chapter 1, 24, Chapter 4, 163, Chapter 11, 444–446 Adrenoleukodystrophy (ALD), Chapter 5, 208HP African populations, genomes of, Chapter 10, 402, 419HP Agammaglobulinemia, Chapter 17, 703 Aging: and Down syndrome (trisomy 21), Chapter 14, 609HP and free radicals, Chapter 2, 35HP and insulin-like growth factors, Chapter 15, 646 and mitochondrial disorders, Chapter 5, 208HP premature (progeria), Chapter 12, 490, 608, Chapter 13, 569HP and telomeres, Chapter 12, 506–508 AIDS (acquired immune deficiency syndrome): and helper T cells, Chapter 17, 709, 717 resistance, Chapter 15, 626HP, Chapter 17, 717 resistance to drugs, Chapter 2, 74–75, Chapter 3, 106–108HP therapies for, Chapter 11, 458HP ALD (adrenoleukodystrophy), Chapter 5, 208–209HP Alzheimer’s disease (AD), Chapter 2, 66–70HP, Chapter 10, 418HP, Chapter 15, 667 Anesthetics, Chapter 4, 167–169 Aneuploidy, Chapter 14, 584, 608–609HP, Chapter 16, 666–667 Antacid medications, Chapter 4, 159–160 Antibiotics, Chapter 3, 106–108HP, Chapter 11, 474 Antidepressants, Chapter 4, 169 Anti-inflammatory drugs, and cancer, Chapter 16, 669 Antioxidants, Chapter 2, 35HP Appetite, Chapter 3, 117 Arthritis, rheumatoid, Chapter 17, 724–726HP Ataxia-telangiectasia, Chapter 14, 579–580 Atherosclerosis, Chapter 8, 307–308, 313–315EP, Chapter 10, 417–418HP Autoimmune diseases, Chapter 7, 250, 257, Chapter 17, 706, 721, 724–726HP Bacterial toxins, Chapter 8, 302, Chapter 15, 627 Bacteriophage therapy, Chapter 1, 26

Benign tumors, Chapter 16, 670 Biofilms, Chapter 1, 13, Chapter 4, 163 Biomarkers, Chapter 2, 72–73, Chapter 16, 693 Blistering diseases, Chapter 7, 250, 257, Chapter 9, 356 Blood-brain barrier, and tight junctions, Chapter 7, 262 Blood cell differentiation, Chapter 17, 703f Blood clots, Chapter 2, 47, Chapter 7, 246–247 Blood glucose, Chapter 3, 117, Chapter 4, 157, Chapter 15, 631–632, 644–646 Blood group (blood type), Chapter 4, 129, 130f, Chapter 10, 416, Chapter 17, 716 Bone marrow, in immune system, Chapter 17, 699, 700f, 703, 721, 724HP Bone marrow transplantation, Chapter 1, 20HP, Chapter 8, 307HP Booster shots, Chapter 17, 707 Breast cancer: BRCA mutations, Chapter 16, 678–679, 692 DNA microarray data, use of, Chapter 16, 686 immunotherapy, Chapter 16, 688 incidence, Chapter 16, 665f risk factors, Chapter 16, 668–669, 678, 679, 693 Burkitt’s lymphoma, Chapter 12, 521, Chapter 16, 668, 680 Calorie-restricted diet, and life span, Chapter 15, 647HP Cancer, Chapter 16, 664–698 causes, Chapter 16, 667–669 and cell adhesion, Chapter 7, 257HP and cell cycle regulation, Chapter 14, 579–580, 586–587, 596 and cell senescence, Chapter 16, 670, 678, 682f and cell signaling, Chapter 16, 678–680 and chromosomal aberrations, Chapter 12, 504–505HP, Chapter 16, 666, 667f, 692 and DNA repair genes, Chapter 13, 569–570HP, Chapter 16, 681 epidemiology, Chapter 16, 668 gene expression analysis, Chapter 16, 685–687 genetics, Chapter 16, 668–687 genome, Chapter 16, 683–684 and growth factor receptors, Chapter 16, 678, 679, 688, 693 and inflammation, Chapter 16, 668 inherited syndromes, Chapter 16, 673t metastatic spread, Chapter 7, 256HP, Chapter 16, 692–693 and mismatch repair, Chapter 16, 673t normal vs. malignant cell properties, Chapter 16, 665–667 and oncogenes, Chapter 16, 671, 679–681, 691, 694–697EP

and TP53 gene, Chapter 16, 678 and RB gene, Chapter 16, 672–674 risk factors, Chapter 16, 668 and telomeres, Chapter 12, 508 therapy, Chapter 10, 398, Chapter 11, 458HP, Chapter 16, 687–691 and tumor-suppressor genes, Chapter 16, 671–673 use of DNA microarray data in diagnosis and treatment of, Chapter 16, 685–687 and viruses, Chapter 16, 668–669, 694–697EP Carcinogens, Chapter 8, 280, Chapter 16, 668, 676, 694EP Cell-mediated immunity, Chapter 17, 703, 706–709, 716–723 Cell replacement therapy, Chapter 1, 20–23HP, Chapter 12, 518 Cervical cancer, Chapter 16, 668, 670, 693 Chemotherapy drugs, Chapter 4, 129f, Chapter 9, 341, Chapter 10, 398, 418HP, Chapter 16, 677f Cholera, Chapter 4, 163, Chapter 15, 627 Cholesterol: and familial hypercholesterolemia, Chapter 8, 319–320EP and LDL, atherosclerosis, Chapter 8, 313–314, Chapter 10, 417HP, 418HP, Chapter 11, 458HP, Chapter 12, 535 Chromosome alterations and aberrations: and apoptosis, Chapter 16, 681 deletions, and retinoblastoma, Chapter 16, 673 duplications, Chapter 10, 407, 407f, 409–410, Chapter 12, 504–505HP and myc oncogene, Chapter 16, 680, 682f nondisjunction, Chapter 14, 608–609HP and oncogene activation, Chapter 16, 657, 657f Chronic myelogenous leukemia (CML), Chapter 2, 75–76, Chapter 16, 690 Ciliopathies, Chapter 9, 349–350HP Clonal selection theory, Chapter 17, 704–706 Cloning, Chapter 12, 512–514 Cockayne syndrome, Chapter 13, 569–570HP Collagen, diseases of, Chapter 7, 238–239 Colon cancer: and anti-inflammatory drugs, Chapter 16, 668 gene mutations in, Chapter 16, 683f hereditary nonpolyposis, Chapter 16, 683 and mismatch repair, Chapter 16, 683 and tumor-suppressor genes, Chapter 16, 678 Color blindness, Chapter 12, 499, Chapter 15, 626 Congenital Diseases of Glycosylation (CDGs), Chapter 8, 286–287

Creutzfeldt-Jakob disease (CJD), Chapter 2, 66HP Cystic fibrosis, Chapter 4, 162–163HP, Chapter 8, 282, Chapter 11, 475 Deafness, and myosin mutations, Chapter 9, 364 Diabetes, Chapter 10, 417–418HP Diabetes insipidus, Chapter 4, 151, Chapter 15, 626HPt Diabetes, type 1, Chapter 1, 20, Chapter 17, 724–726HP Diabetes, type 2, Chapter 10, 418, Chapter 15, 646 Diarrhea, and osmosis, Chapter 4, 150, Chapter 15, 627 Diet, and cancer, Chapter 16, 668 DNA fingerprinting, Chapter 10, 402 DNA repair, Chapter 13, 564–568 Down syndrome (trisomy 21), Chapter 12, 505HP, Chapter 14, 608–609HP Drug development, Chapter 2, 75–76, Chapter 8, 301, Chapter 16, 689–691, Chapter 17, 725–726HP Dwarfism, Chapter 7, 240


Hemophilia, from “jumping” genetic elements, Chapter 10, 409 Herceptin, Chapter 16, 688 Herpes viruses, Chapter 16, 669 HIV (human immunodeficiency virus), Chapter 1, 24–25 and helper T cells, Chapter 17, 709 Human Genome Project, Chapter 10, 420HP Human papillomaviruses (HPV), Chapter 16, 669, 678 Humoral immunity, Chapter 17, 703, 710–716, 723–724 Huntington’s disease, Chapter 10, 404–405HP, 417HP, Chapter 15, 657HP Hydrocephalus, Chapter 7, 253 Hypertension, Chapter 3, 104, Chapter 15, 626

Gaucher’s disease, Chapter 8, 306HP Gene number, Chapter 10, 411–412 Gene therapy, Chapter 1, 23–24, Chapter 4, 163 Genomic analysis, human, Chapter 10, 411–420 Gleevec, Chapter 2, 75–76, Chapter 16, 691 Glycolipids, diseases of, Chapter 4, 126, Chapter 8, 306–307HP Graft rejection, Chapter 17, 716–717

I-cell disease, Chapter 8, 306HP Immune response, Chapter 17, 699–730 adaptive (acquired), Chapter 17, 703, 704, 707–710, 722–724 innate, Chapter 17, 700–703 overview, Chapter 17, 699–704 primary, Chapter 17, 711f secondary, Chapter 17, 711f against self, Chapter 17, 699, 706, 720, 721, 724–726HP Immune system, Chapter 17, 699–730 Immunization, Chapter 17, 706–707 Immunotherapy, Chapter 2, 67, 68, Chapter 16, 688–689 Inborn errors of metabolism, Chapter 11, 427 Induced pluripotent stem cells, Chapter 1, 22–23, Chapter 12, 518–519 Infections: bacterial, adaptive immune responses, Chapter 17, 703, 706–707 bacterial, as a cancer-causing agent, Chapter 16, 669 bacterial, innate immune responses, Chapter 17, 700–703 protective mechanisms, Chapter 17, 700–703 resistant bacterial, Chapter 3, 106–108HP Inflammation, Chapter 7, 255HP, Chapter 17, 702, 710, 725–726HP Influenza, Chapter 1, 25 Innate immune responses, Chapter 17, 700–703 Insulin signaling, Chapter 4, 157, Chapter 15, 644–645, Chapter 17, 724HP Interferons (IFNs), Chapter 17, 703, 707, 724, 725–726HP Interleukins (ILs), Chapter 17, 708, 709t, 723

Heart attacks, heart disease, Chapter 4, 159, Chapter 7, 246–247, 255HP and nitroglycerine, Chapter 15, 656 Heart muscle: contraction, and gap junctions, Chapter 7, 263–264, Chapter 15, 649 and miRNAs, Chapter 12, 539–540 Heartburn, Chapter 4, 159, 159f Hemolytic anemias, Chapter 4, 147

Kaposi’s sarcoma, Chapter 15, 626HP, Chapter 16, 669 Kartagener syndrome, Chapter 9, 349HP Karyotypes, Chapter 12, 503f, Chapter 14, 609f, Chapter 16, 667f Kidneys: failure from diabetes, Chapter 7, 237 polycystic disease, Chapter 9, 349HP tight junctions, Chapter 7, 262

Embryonic development: cell movements, Chapter 7, 243f, Chapter 9, 381 cilia, Chapter 9, 349HP and epithelial-mesenchymal transitions, Chapter 7, 257 and genomic imprinting, Chapter 12, 532 and miRNAs, Chapter 12, 539–540 Embryonic stem (ES) cells, Chapter 1, 21–22, Chapter 18, 779–780 Enzyme replacement therapy, Chapter 8, 307HP Epstein-Barr virus, Chapter 16, 682 Exercise, Chapter 5, 188HP Fabry disease, Chapter 8, 306HPt Fragile X syndrome, Chapter 10, 405HP Free radicals and aging, Chapter 2, 35HP

Lactose tolerance, Chapter 10 Leukemias: and chromosomal translocations, Chapter 11, 458HP, Chapter 12, 504–505HP, Chapter 16, 690–691 and gene-expression profiling, Chapter 16, 686–687 Leukocyte adhesion deficiency (LAD), Chapter 7, 255–256HP Listeria, Chapter 9, 374 Longevity, Chapter 2, 35HP, Chapter 5, 208–209, Chapter 15, 646 Lysosomal storage disorders, Chapter 8, 306–307HP Macular degeneration, Chapter 10, 418HP, Chapter 11, 452HP Mad cow disease, Chapter 2, 66HP Malaria, Chapter 17, 717 Marijuana, Chapter 4, 170 Marker chromosomes, Chapter 12, 509 Melanoma, Chapter 13, 570, Chapter 16 Metabolism, anaerobic and aerobic, Chapter 5, 188HP Metastasis, Chapter 7, 256HP, Chapter 16, 665f, 687, 693 Microbiome, human, Chapter 1, 15 Mitochondrial diseases, Chapter 5, 207–208HP Multiple sclerosis (MS), Chapter 4, 168, Chapter 17, 724–726HP Muscle fibers and contractility, Chapter 5, 188HP, Chapter 9, 364–371 Muscular dystrophies, Chapter 4, 147, Chapter 11, 475, Chapter 12, 490 Mutagenic agents, Chapter 16, 667–668, 675 Mutations: in cancer, Chapter 16, 669–687 and mitochondrial disorders, Chapter 5, 207–208HP and radiation, Chapter 10, 392, Chapter 13, 567 in rearranged antibody DNA, Chapter 17, 715 and splicing, Chapter 11, 449 in tumor-suppressor genes vs. oncogenes, Chapter 16, 671 Nerve cells, mitochondrial abnormalities, Chapter 5, 207–208HP Nerve gas, Chapter 3, 104, Chapter 4, 170 Nervous system disorders, Chapter 5, 207–208HP, Chapter 9, 356, Chapter 10, 404–405HP, Chapter 16, 673 Neurofibrillary tangles (NFTs), Chapter 2, 70, Chapter 9, 331 Nicotine addiction, Chapter 4, 171EPfn Niemann-Pick type C disease, Chapter 8, 306t, 318 Non-Hodgkin’s B-cell lymphoma, Chapter 16, 688 “Nonself,” Chapter 17, 700 Nonsteroidal anti-inflammatory drugs (NSAIDs), Chapter 16, 669 Ovarian cancer, Chapter 16, 665f, 678

Tuberculosis, Chapter 3, 106HP, Chapter 8, 309 Tumor necrosis factors (TNFs), Chapter 15, 658–659, Chapter 17, 708, 725HP

Radiation, as a carcinogen, Chapter 10, 392, Chapter 13, 564–568, Chapter 14, 579–580 Retinoblastoma, Chapter 16, 673–675 Retinitis pigmentosa, Chapter 15, 625, 626t Retroviruses (RNA tumor viruses), Chapter 16, 668, 671, 679, 694–697EP Rheumatoid arthritis, Chapter 17, 724–726HP RNA interference, clinical applications, Chapter 11, 458–459HP

Sex chromosomes, abnormal number of, Chapter 14, 609HP Sexual arousal, Chapter 15, 656 Sickle cell anemia, Chapter 2, 55, Chapter 11, 462 Skin: blistering diseases, Chapter 7, 250, 257, Chapter 9, 356 cancers, Chapter 13, 569–570HP grafts, Chapter 17, 716 histology, Chapter 9, 357f tight junctions, Chapter 7, 260–262 Smell (olfaction), Chapter 15, 634–635 Smoking, Chapter 4, 171EPfn, Chapter 16, 669 Snake venom, Chapter 3, 104, Chapter 4, 172EP Speech and language disorders, Chapter 10, 414 Sphingolipid storage diseases, Chapter 8, 306HP Spongiform encephalopathy, Chapter 2, 66HP Statin drugs, Chapter 2, 69HP, Chapter 8, 314 Stem cells, Chapter 1, 20–23HP, Chapter 13, 564, Chapter 16, 670, 692–693 Stroke, Chapter 7, 246, 255HP Systemic lupus erythematosus (SLE), Chapter 17, 725–726HP

Scurvy, Chapter 7, 238 “Self,” Chapter 17, 700, 721, 724–726HP, 730EP antibodies against, Chapter 17, 701f, 724–726HP distinguishing from nonself, Chapter 17, 721, 724, 726HP immunologic tolerance, Chapter 17, 706, 709f, 721

Taste (gustation), Chapter 15, 634 Tay–Sachs disease, Chapter 8, 306–307HP Testosterone, Chapter 2, 49, Chapter 15, 626HP, 658 Thymus gland, Chapter 17, 703, 721, 724HP Tolerance, immunologic (towards “self”), Chapter 17, 706, 709f, 721 Trans fats, Chapter 2, 49 Transplant rejection, Chapter 17, 716, 721

Weed killers, Chapter 6, 225 Whooping cough, Chapter 15, 627

Pap smear, Chapter 16, 670, 670t, 693 Parkinson’s disease, Chapter 5, 208HP, Chapter 15, 657 Periodontal disease, Chapter 7, 257HP Peroxisomal diseases, Chapter 5, 208–209HP Polycystic kidney disease, Chapter 9, 349HP Prader-Willi syndrome, Chapter 12, 532 Precocious puberty, Chapter 15, 626HP Pregnancy, IgG-based immunity, Chapter 17, 713 Prilosec, Chapter 4, 159, 159f Prions, Chapter 2, 66–67HP Prostate cancer, Chapter 2, 73, Chapter 15, 658 Prozac, Chapter 4, 170

Ultraviolet light, DNA damage from, Chapter 13, 564–570, Chapter 16, 667 Vaccination, Chapter 17, 706–707 Viagra, Chapter 15, 656 Viruses, Chapter 1, 23–26, Chapter 16, 667–668 acquired immune responses to, Chapter 17, 701f, 703 and cancer, Chapter 16, 667–668, 694–697EP innate immune responses to, Chapter 17, 701f, 702–703 interactions with T cells, Chapter 17, 707, 716–718, 727–728EP and oncogenes, Chapter 16, 694–697EP provirus, Chapter 1, 25–26, Chapter 16, 694EP resistance to, and interferon, Chapter 17, 701f, 703, 724 treatment with RNAi, Chapter 11, 458HP Vision, Chapter 15, 623, 624, 625–626HP, 634 Vitamin C deficiency, Chapter 7, 238

X chromosome inactivation, Chapter 12, 499–500, 509 Xeroderma pigmentosum (XP), Chapter 13, 569–570HP Zellweger syndrome (ZS), Chapter 5, 208HP